A step towards the integration of machine learning and classic model-based survey methods

Żądło, Tomasz; Chwila, Adam

Tomasz Żądło and Adam Chwila

Abstract: The use of machine learning methods in traditional surveys, including official statistics, is still very limited. We therefore propose a predictor supported by these algorithms that can be used to predict any population or subpopulation characteristic. Machine learning methods have already been shown to be very powerful in identifying and modelling complex, nonlinear relationships between variables, which means they have very good properties under strong departures from the classic assumptions. Therefore, we analyse the performance of our proposal under a different set-up, which, in our opinion, is of greater importance in real-life surveys: we study only small departures from the assumed model to show that our proposal is a good alternative, even in comparison with methods that are optimal under the model. Moreover, we propose a method for the ex ante estimation of the accuracy of machine learning predictors, which makes it possible to compare their accuracy with that of classic methods. The literature identifies a solution to this problem as one of the key issues in integrating these approaches. The simulation studies are based on a real longitudinal dataset and consider the prediction of subpopulation characteristics.

Date: 2024-02, Revised 2025-07
New Economics Papers: this item is included in nep-big, nep-cmp and nep-ure
Downloads: (external link)
http://arxiv.org/pdf/2402.07521 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2402.07521
