The Challenge of Pairing Big Datasets: Probabilistic Record Linkage Methods and Diagnosis of Their Empirical Viability

Yaohao Peng and Lucas Ferreira Mation
Additional contact information
Yaohao Peng: Brazilian Secretariat for Economic Policy
Lucas Ferreira Mation: Brazilian Institute of Applied Economic Research

Journal of Business Cycle Research, 2020, vol. 16, issue 1, No 3, 35-57

Abstract: In this paper, we evaluated the predictive performance of probabilistic record linkage algorithms, discussing how different configurations of blocking keys, string similarity functions, and phonetic codes affect the overall predictive performance and computational complexity. Furthermore, we carried out a bibliographical survey of the main deterministic and probabilistic record linkage methods, of recent advances that incorporate machine learning techniques, and of the main packages and implementations available in the open-source R language. The results can provide heuristics for problems of administrative records integration at the national level and have potential value for the formulation and evaluation of public policies.
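
As an illustration of the three ingredients named in the abstract (blocking keys, a phonetic code, and a string similarity function), the following is a minimal sketch and not the authors' implementation. It is written in Python with the standard library only. The record fields (`id`, `name`, `birth_year`), the function names, and the 0.85 threshold are hypothetical choices made for this example. A fixed similarity threshold stands in for the match weights a probabilistic linkage model would estimate, and `difflib.SequenceMatcher` stands in for a dedicated name-similarity measure.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Letter groups of the classic Soundex code (assumed variant: American Soundex).
_SOUNDEX_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                   ("L", "4"), ("MN", "5"), ("R", "6"))
_SOUNDEX = {ch: digit for letters, digit in _SOUNDEX_GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Return a four-character Soundex code: first letter plus three digits."""
    letters = "".join(ch for ch in name.upper() if ch.isalpha())
    if not letters:
        return ""
    code = letters[0]
    prev = _SOUNDEX.get(letters[0], "")
    for ch in letters[1:]:
        digit = _SOUNDEX.get(ch, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate letters that share a digit; vowels do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


def block_key(record: dict) -> tuple:
    """Hypothetical blocking key: Soundex of the last name token plus birth year."""
    surname = record["name"].split()[-1]
    return (soundex(surname), record["birth_year"])


def link(records_a: list, records_b: list, threshold: float = 0.85) -> list:
    """Compare only records that share a blocking key; keep pairs above threshold."""
    blocks = defaultdict(list)
    for rec in records_b:
        blocks[block_key(rec)].append(rec)

    matches = []
    for rec_a in records_a:
        for rec_b in blocks.get(block_key(rec_a), []):
            score = SequenceMatcher(
                None, rec_a["name"].lower(), rec_b["name"].lower()
            ).ratio()
            if score >= threshold:
                matches.append((rec_a["id"], rec_b["id"], round(score, 3)))
    return matches


if __name__ == "__main__":
    # Made-up records for illustration only.
    file_a = [{"id": 1, "name": "Maria Silva", "birth_year": 1980}]
    file_b = [
        {"id": 10, "name": "Maria Sylva", "birth_year": 1980},
        {"id": 11, "name": "Maria Silva", "birth_year": 1975},
    ]
    print(link(file_a, file_b))
```

In this sketch, blocking restricts comparisons to records that share a key, which is what reduces the number of candidate pairs, at the cost of missing true matches whose blocking fields disagree (record 11 above is never compared because its birth year differs).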

Keywords: Record linkage; Blocking; Administrative records; Big data; R
JEL-codes: C52 C55 C65 C80 C88
Date: 2020

Downloads: (external link)
http://link.springer.com/10.1007/s41549-020-00043-1 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:jbuscr:v:16:y:2020:i:1:d:10.1007_s41549-020-00043-1

Ordering information: This journal article can be ordered from
http://www.springer. ... nomics/journal/41549

DOI: 10.1007/s41549-020-00043-1

Journal of Business Cycle Research is currently edited by Michael Graff

More articles in Journal of Business Cycle Research from Springer, Centre for International Research on Economic Tendency Surveys (CIRET)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:jbuscr:v:16:y:2020:i:1:d:10.1007_s41549-020-00043-1