Automated Linking of Historical Data
Leah Boustan,
Katherine Eriksson,
James Feigenbaum and
Santiago Perez
No 25825, NBER Working Papers from National Bureau of Economic Research, Inc
The recent digitization of complete count census data is an extraordinary opportunity for social scientists to create large longitudinal datasets by linking individuals from one census to another or from other sources to the census. We evaluate different automated methods for record linkage, performing a series of comparisons across methods and against hand linking. We have three main findings that lead us to conclude that automated methods perform well. First, a number of automated methods generate very low (less than 5%) false positive rates. The automated methods trace out a frontier illustrating the tradeoff between the false positive rate and the (true) match rate. Relative to more conservative automated algorithms, humans tend to link more observations but at a cost of higher rates of false positives. Second, when human linkers and algorithms have the same amount of information, there is relatively little disagreement between them. Third, across a number of plausible analyses, coefficient estimates and parameters of interest are very similar when using linked samples based on each of the different automated methods. We provide code and Stata commands to implement the various automated methods.
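The tradeoff described in the abstract, between the false positive rate and the (true) match rate, can be made concrete with a small Python sketch. This is not the authors' code (the paper provides Stata commands); the function name `linkage_rates`, its parameters, and the toy record IDs are hypothetical. The definitions are assumptions for illustration: the false positive rate is taken as the share of proposed links that disagree with a reference set of links (for example, hand links), and the true match rate as the share of base records that are correctly linked.

```python
from typing import Dict, Tuple


def linkage_rates(
    proposed: Dict[str, str],
    truth: Dict[str, str],
    n_base: int,
) -> Tuple[float, float, float]:
    """Return (match_rate, false_positive_rate, true_match_rate).

    proposed: base-record ID -> target-record ID chosen by a linking method.
    truth:    base-record ID -> target-record ID in the reference links.
    n_base:   number of base records the method attempted to link.

    Assumption: a proposed link with no entry in `truth` counts as a
    false positive, because it cannot be confirmed.
    """
    if n_base <= 0:
        raise ValueError("n_base must be positive")

    n_links = len(proposed)
    # A link is wrong when the reference links point somewhere else (or nowhere).
    n_wrong = sum(1 for base, target in proposed.items() if truth.get(base) != target)

    match_rate = n_links / n_base
    false_positive_rate = n_wrong / n_links if n_links else 0.0
    true_match_rate = (n_links - n_wrong) / n_base
    return match_rate, false_positive_rate, true_match_rate


# Toy data (hypothetical): five base records, four of which have a reference link.
truth = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}

# A conservative method links fewer records; a liberal one links more but errs once.
conservative = {"a1": "b1", "a2": "b2"}
liberal = {"a1": "b1", "a2": "b2", "a3": "b9", "a4": "b4"}

for name, links in [("conservative", conservative), ("liberal", liberal)]:
    print(name, linkage_rates(links, truth, n_base=5))
```

In this toy case the liberal method links more records but a larger share of its links are wrong, which is the kind of frontier the abstract describes.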
JEL-codes: C81 N0
New Economics Papers: this item is included in nep-bec, nep-big, nep-his, nep-ore and nep-pay
Note: DAE TWP
Citations: 11 in EconPapers
Access to the full text is generally limited to series subscribers; however, free access is provided if the top-level domain of the client browser is in a developing country or transition economy. More information about subscriptions and free access is available at http://www.nber.org/wwphelp.html. Free access is also available to older working papers.
Persistent link: https://EconPapers.repec.org/RePEc:nbr:nberwo:25825
Publisher: National Bureau of Economic Research, 1050 Massachusetts Avenue, Cambridge, MA 02138, U.S.A.