000 02828cam a22003737 4500
001 w24324
003 NBER
005 20211020104634.0
006 m o d
007 cr cnu||||||||
008 210910s2018 mau fo 000 0 eng d
100 1 _aAbramitzky, Ran.
_933381
245 1 0 _aLinking Individuals Across Historical Sources:
_ba Fully Automated Approach /
_cRan Abramitzky, Roy Mill, Santiago Pérez.
260 _aCambridge, Mass.
_bNational Bureau of Economic Research
_c2018.
300 _a1 online resource:
_billustrations (black and white)
490 1 _aNBER working paper series
_vno. w24324
500 _aFebruary 2018.
520 3 _aLinking individuals across historical datasets relies on information such as name and age that is both non-unique and prone to enumeration and transcription errors. These errors make it impossible to find the correct match with certainty. In the first part of the paper, we suggest a fully automated probabilistic method for linking historical datasets that enables researchers to create samples at the frontier of minimizing type I (false positive) and type II (false negative) errors. The first step guides researchers in the choice of which variables to use for linking. The second step uses the Expectation-Maximization (EM) algorithm, a standard tool in statistics, to compute the probability that each pair of records corresponds to the same individual. The third step suggests how to use these estimated probabilities to choose which records to use in the analysis. In the second part of the paper, we apply the method to link historical population censuses in the US and Norway, and use these samples to estimate measures of intergenerational occupational mobility. The estimates using our method are remarkably similar to the ones using IPUMS' method, which relies on hand linking to create a training sample. We created R code and a Stata command that implement this method.
530 _aHardcopy version available to institutional subscribers
538 _aSystem requirements: Adobe [Acrobat] Reader required for PDF files.
538 _aMode of access: World Wide Web.
588 0 _aPrint version record
690 7 _aC10 - General
_2Journal of Economic Literature class.
690 7 _aJ01 - Labor Economics: General
_2Journal of Economic Literature class.
690 7 _aJ10 - General
_2Journal of Economic Literature class.
690 7 _aN00 - General
_2Journal of Economic Literature class.
700 1 _aMill, Roy.
700 1 _aPérez, Santiago.
710 2 _aNational Bureau of Economic Research.
830 0 _aWorking Paper Series (National Bureau of Economic Research)
_vno. w24324.
856 4 0 _uhttps://www.nber.org/papers/w24324
856 _yOnline access via DOI
_uhttp://dx.doi.org/10.3386/w24324
942 _2ddc
_cW-PAPER
999 _c323798
_d282360