Machine learning regionalisation of input data for microsimulation models: An application of a hybrid GBM / IPF method to build a tax-benefit model for the Essex region in the UK

Richiardi, Matteo; Rejoice, Frimpong

No CEMPA9/25, Centre for Microsimulation and Policy Analysis Working Paper Series from Centre for Microsimulation and Policy Analysis at the Institute for Social and Economic Research

Abstract: Developing microsimulation models often requires reweighting an input dataset to reflect the characteristics of a different population of interest. In this paper we explore a machine learning approach in which a variant of decision trees (Gradient Boosted Machine) is used to replicate the joint distribution of target variables observed in a large, commercially available but slightly biased dataset, followed by a raking step that removes the bias and ensures that the relevant marginal distributions are consistent with official statistics. The method is applied to build a regional variant of UKMOD, an open-source static tax-benefit model for the UK belonging to the EUROMOD family, for the Greater Essex region in the UK.
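
The raking step mentioned in the abstract can be illustrated with a minimal sketch of iterative proportional fitting (IPF) on unit weights. This is not the paper's implementation: the function name `rake`, the toy variables `sex` and `age`, and the target totals are all hypothetical, and the GBM stage is not shown. It assumes each margin is a categorical variable coded as integers and that the target totals of the margins sum to the same population.

```python
import numpy as np

def rake(weights, categories, targets, max_iter=200, tol=1e-9):
    """Hypothetical helper: rescale unit weights until weighted category
    totals match the target totals on every margin (IPF / raking)."""
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(max_iter):
        max_gap = 0.0
        for cat, target in zip(categories, targets):
            target = np.asarray(target, dtype=float)
            # Current weighted total in each category of this margin.
            current = np.bincount(cat, weights=w, minlength=len(target))
            max_gap = max(max_gap, np.max(np.abs(current - target)))
            # Scale factor per category; empty categories are left unchanged.
            factor = np.divide(target, current,
                               out=np.ones_like(target), where=current > 0)
            w *= factor[cat]
        if max_gap < tol:  # all margins already matched before this sweep
            break
    return w

# Toy data (assumed, not from the paper): six units, two margins.
sex = np.array([0, 0, 0, 1, 1, 1])
age = np.array([0, 1, 1, 0, 0, 1])
sex_totals = [40.0, 60.0]   # illustrative "official" totals
age_totals = [55.0, 45.0]   # must sum to the same population (100)

w = rake(np.ones(6), [sex, age], [sex_totals, age_totals])
print(np.bincount(sex, weights=w), np.bincount(age, weights=w))
```

In a hybrid workflow of the kind the abstract describes, the initial weights passed to such a routine would come from the machine learning stage rather than being uniform, so that raking only corrects the margins while leaving the learned joint structure as intact as possible.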

Date: 2025-08-11
New Economics Papers: this item is included in nep-big and nep-cmp
Downloads: (external link)
https://www.iser.essex.ac.uk/wp-content/uploads/fi ... /cempa/cempa9-25.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ese:cempwp:cempa9-25
