How Well Do Automated Linking Methods Perform? Lessons from U.S. Historical Data

Martha Bailey, Connor Cole, Morgan Henderson and Catherine Massey

No 24019, NBER Working Papers from National Bureau of Economic Research, Inc

Abstract: This paper reviews the literature in historical record linkage in the U.S. and examines the performance of widely-used automated record linking algorithms in two high-quality historical datasets and one synthetic ground truth. Focusing on algorithms in current practice, our findings highlight the important effects of linking methods on data quality. We find that (1) no method (including hand-linking) consistently produces representative samples; (2) 15 to 37 percent of links chosen by prominent machine linking algorithms are identified as false links by human reviewers; and (3) these false links are systematically related to baseline sample characteristics, suggesting that machine algorithms may introduce complicated forms of bias into analyses. We find that prominent linking algorithms attenuate estimates of the intergenerational income elasticity by up to 20 percent and common variations in algorithm choices result in greater attenuation. These results recommend that current practice could be improved by placing more emphasis on reducing false links and less emphasis on increasing match rates. We conclude with constructive suggestions for reducing linking errors and directions for future research.

JEL-codes: J62 N0
Date: 2017-11
New Economics Papers: this item is included in nep-big and nep-pay
Note: AG DAE LS
Citations: 32 (in EconPapers)

Published as Martha J. Bailey & Connor Cole & Morgan Henderson & Catherine Massey, 2020. "How Well Do Automated Linking Methods Perform? Lessons from US Historical Data," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 997-1044, December.

Downloads: (external link)
http://www.nber.org/papers/w24019.pdf (application/pdf)

Related works:
Journal Article: How Well Do Automated Linking Methods Perform? Lessons from US Historical Data (2020)

Persistent link: https://EconPapers.repec.org/RePEc:nbr:nberwo:24019

Ordering information: This working paper can be ordered from
http://www.nber.org/papers/w24019

Publisher: National Bureau of Economic Research, 1050 Massachusetts Avenue, Cambridge, MA 02138, U.S.A.
Page updated 2025-03-19
Handle: RePEc:nbr:nberwo:24019