
Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning

John Abowd, Joelle Hillary Abramowitz, Margaret Levenstein, Kristin McCue, Dhiren Patki, Trivellore Raghunathan, Ann Michelle Rodgers, Matthew Shapiro, Nada Wasi and Dawn Zinsser
Additional contact information
Trivellore Raghunathan: https://sph.umich.edu/faculty-profiles/raghunathan-trivellore.html
Ann Michelle Rodgers: https://mcommunity.umich.edu/person/anrodger

No 22-11, Working Papers from Federal Reserve Bank of Boston

Abstract: This paper considers the problem of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across establishments is highly skewed. To address these difficulties, this paper develops a probabilistic record linkage methodology that combines machine learning (ML) with multiple imputation (MI). This ML-MI methodology is applied to link survey respondents in the Health and Retirement Study to their workplaces in the Census Business Register. The linked data reveal new evidence that non-sampling errors in household survey data are correlated with respondents’ workplace characteristics.

Keywords: administrative data; machine learning; multiple imputation; probabilistic record linkage; survey data
JEL-codes: C13 C18 C81
Pages: 36
Date: 2021-10-01
New Economics Papers: this item is included in nep-big and nep-cmp

Downloads: (external link)
https://www.bostonfed.org/publications/research-de ... ing-machine-learning Summary (text/html)
https://www.bostonfed.org/-/media/Documents/Workingpapers/PDF/2022/wp2211.pdf Full text (application/pdf)

Related works:
Working Paper: Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning (2021)

Persistent link: https://EconPapers.repec.org/RePEc:fip:fedbwp:94891

DOI: 10.29412/res.wp.2022.11


 
Page updated 2025-03-30
Handle: RePEc:fip:fedbwp:94891