Evaluating Data Fusion Methods to Improve Income Modelling
Ralf Münnich and
No 2022-03, Research Papers in Economics from University of Trier, Department of Economics
Income is an important economic indicator to measure living standards and individual well-being. In Germany, there exist different data sources that yield ambiguous evidence when analysing the income distribution. The Tax Statistics (TS), an income register recording the total population of more than 40 million taxpayers in Germany for the year 2014, contains the most reliable income information covering the full income distribution. However, it offers only a limited range of socio-demographic variables essential for income analysis. We tackle this challenge by enriching the tax data with information on education and working time from the Microcensus. For that purpose, we examine two types of data fusion methods that seem suited for the specific data fusion scenario of the Tax Statistics and the Microcensus: Missing-data methods on the one hand and performant prediction models on the other hand. We conduct a simulation study and provide an empirical application comparing the proposed data fusion methods, and our results indicate that Multinomial Regression and Random Forest are the most suitable methods for our data fusion scenario.
Keywords: Statistical Matching; Multi-source Estimation; Missing Data; Income Analysis; Statistical Learning
Pages: 30
Downloads: (external link)
http://www.uni-trier.de/fileadmin/fb4/prof/VWL/EWF/Research_Papers/2022-03.pdf First version, 2022 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:trr:wpaper:202203
Bibliographic data for series maintained by Matthias Neuenkirch.