Playing with Matches: An Assessment of Accuracy in Linked Historical Data
Catherine G. Massey
CARRA Working Papers from Center for Economic Studies, U.S. Census Bureau
Abstract:
This paper evaluates linkage quality achieved by various record linkage techniques used in historical demography. I create benchmark, or truth, data by linking the 2005 Current Population Survey Annual Social and Economic Supplement to the Social Security Administration’s Numeric Identification System by Social Security Number. By comparing simulated linkages to the benchmark data, I examine the value added (in terms of number and quality of links) from incorporating text-string comparators, adjusting age, and using a probabilistic matching algorithm. I find that text-string comparators and probabilistic approaches are useful for increasing the linkage rate, but use of text-string comparators may decrease accuracy in some cases. Overall, probabilistic matching offers the best balance between linkage rates and accuracy.
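The comparison of simulated linkages against benchmark data can be illustrated with a minimal sketch. It assumes that the linkage rate is the share of source records that receive any link, and that accuracy is the share of those links that agree with the benchmark; the paper may define both measures differently. The function name `evaluate_links` and the toy records are hypothetical and are not drawn from the paper's data.

```python
def evaluate_links(candidate_links, benchmark_links, n_records):
    """Score a set of simulated links against benchmark (truth) links.

    candidate_links, benchmark_links: dicts mapping a source record id
    to the id of the record it was linked to.
    n_records: number of source records eligible for linkage.
    Returns (linkage_rate, accuracy) under the assumed definitions above.
    """
    linked = len(candidate_links)
    # A link counts as correct only if the benchmark pairs the same two records.
    correct = sum(
        1 for src, tgt in candidate_links.items()
        if benchmark_links.get(src) == tgt
    )
    linkage_rate = linked / n_records if n_records else 0.0
    accuracy = correct / linked if linked else 0.0
    return linkage_rate, accuracy


# Hypothetical toy data: four source records with known true links.
benchmark = {"a": 1, "b": 2, "c": 3, "d": 4}
exact = {"a": 1, "b": 2}          # stricter rule: fewer links, all correct
fuzzy = {"a": 1, "b": 2, "c": 9}  # looser rule: more links, one wrong

exact_rate, exact_acc = evaluate_links(exact, benchmark, len(benchmark))
fuzzy_rate, fuzzy_acc = evaluate_links(fuzzy, benchmark, len(benchmark))
```

In this toy case the looser rule links more records but a smaller share of its links is correct, which is the kind of trade-off between linkage rate and accuracy that the abstract describes.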
Keywords: historical demography; matches; record linkage
Date: 2016-06
Download: https://www.census.gov/content/dam/Census/library/ ... carra-wp-2016-05.pdf (first version, 2016; application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:cen:cpaper:2016-05