Evaluating Data Fusion Methods to Improve Income Modelling

Jana Emmenegger, Ralf Münnich and Jannik Schaller

No 2022-03, Research Papers in Economics from University of Trier, Department of Economics

Abstract: Income is an important economic indicator to measure living standards and individual well-being. In Germany, there exist different data sources that yield ambiguous evidence when analysing the income distribution. The Tax Statistics (TS), an income register recording the total population of more than 40 million taxpayers in Germany for the year 2014, contains the most reliable income information covering the full income distribution. However, it offers only a limited range of socio-demographic variables essential for income analysis. We tackle this challenge by enriching the tax data with information on education and working time from the Microcensus. For that purpose, we examine two types of data fusion methods that seem suited for the specific data fusion scenario of the Tax Statistics and the Microcensus: Missing-data methods on the one hand and performant prediction models on the other hand. We conduct a simulation study and provide an empirical application comparing the proposed data fusion methods, and our results indicate that Multinomial Regression and Random Forest are the most suitable methods for our data fusion scenario.

Keywords: Statistical Matching; Multi-source Estimation; Missing Data; Income Analysis; Statistical Learning
Pages: 30
Date: 2022

Bibliographic data for series maintained by Matthias Neuenkirch.

Page updated 2022-10-05
Handle: RePEc:trr:wpaper:202203