Machine Learning Based Linkage of Company Data for Economic Research: Application to the EBDC Business Panels
Valentin Reich
No 409, ifo Working Paper Series from ifo Institute - Leibniz Institute for Economic Research at the University of Munich
Abstract:
This article presents a comprehensive approach to probabilistic linkage of German company data using Machine Learning and Natural Language Processing techniques. The long-running ifo Institute surveys are linked to financial information in the Orbis database by addressing the unique challenges of company data linkage, such as corporate structures and linguistic nuances in company names. Compared to a previous linkage, the approach achieves improved match rates and is able to re-evaluate existing matches. This article contributes best-practice advice for company data linkage and serves as documentation for the resulting research dataset.
Keywords: record linkage; company data; orbis; survey data
JEL-codes: C81 C88
Date: 2024
New Economics Papers: this item is included in nep-big
Downloads:
https://www.ifo.de/DocDL/wp-2024-409_reich_linkage-of-company-data.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:ces:ifowps:_409
Bibliographic data for series maintained by Klaus Wohlrabe.