Addressing Sample Selection Bias for Machine Learning Methods

Dylan Brewer and Alyssa Carlson

No 2310, Working Papers from Department of Economics, University of Missouri

Abstract: We study approaches for adjusting machine learning methods when the training sample differs from the prediction sample on unobserved dimensions. The machine learning literature predominantly assumes selection only on observed dimensions. Common remedies for selection on observables are to weight observations or to include variables that influence selection. Simulation results show that selection on unobservables increases the mean squared prediction error of popular machine learning algorithms. Common machine learning practices, such as weighting or including variables that influence selection into the training or prediction sample, often worsen sample selection bias. We propose two control-function approaches that remove the effects of selection bias before training and find that they reduce mean squared prediction error in simulations. We apply these approaches to predicting the vote share of the incumbent in gubernatorial elections using previously observed re-election bids. We find that ignoring selection on unobservables leads to substantially higher predicted vote shares for the incumbent than when the control-function approach is used.
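
Illustration: the idea of removing selection effects before training can be sketched with a generic Heckman-style two-step control function. This is not the authors' implementation; it is a minimal simulation under assumed values (the data-generating process, the correlation `rho`, the excluded selection shifter `z`, and the helper names `norm_pdf`, `norm_cdf`, and `fit_probit` are all hypothetical), with ordinary least squares standing in for a machine learning algorithm.

```python
import numpy as np
from math import erf, sqrt, pi

rng = np.random.default_rng(0)
n, rho = 20_000, 0.8  # assumed sample size and error correlation

# Simulated population (every number here is an illustrative assumption).
x = rng.normal(size=n)   # predictor observed for everyone
z = rng.normal(size=n)   # shifts selection only (exclusion restriction)
u = rng.normal(size=n)   # unobservable that drives selection
e = rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n)  # corr(u, e) = rho
y = 1.0 + 2.0 * x + e
s = (0.5 * x + z + u) > 0  # y is available for training only when s is True

def norm_pdf(t):
    return np.exp(-0.5 * t**2) / sqrt(2 * pi)

def norm_cdf(t):
    return 0.5 * (1.0 + np.vectorize(erf)(t / sqrt(2)))

def fit_probit(W, d, iters=25):
    """Probit coefficients via Fisher scoring (plain NumPy)."""
    g = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ g
        p = np.clip(norm_cdf(idx), 1e-10, 1 - 1e-10)
        f = norm_pdf(idx)
        score = W.T @ (f * (d - p) / (p * (1 - p)))
        info = (W * (f**2 / (p * (1 - p)))[:, None]).T @ W
        g = g + np.linalg.solve(info, score)
    return g

# Step 1: model selection on the full sample and form the inverse Mills ratio.
W = np.column_stack([np.ones(n), x, z])
idx = W @ fit_probit(W, s.astype(float))
mills = norm_pdf(idx[s]) / norm_cdf(idx[s])

# Step 2: estimate the control-function coefficient in the selected sample
# and subtract the fitted selection term from the outcome.
X_sel = np.column_stack([np.ones(s.sum()), x[s]])
coef = np.linalg.lstsq(np.column_stack([X_sel, mills]), y[s], rcond=None)[0]
gamma = coef[2]
y_adj = y[s] - gamma * mills

# Step 3: train the learner (OLS as a stand-in) on the adjusted outcome,
# and train a naive learner on the raw selected outcome for comparison.
b_cf = np.linalg.lstsq(X_sel, y_adj, rcond=None)[0]
b_naive = np.linalg.lstsq(X_sel, y[s], rcond=None)[0]

# Evaluate both on the whole population, where y is known by construction.
X_all = np.column_stack([np.ones(n), x])
mspe_naive = np.mean((y - X_all @ b_naive) ** 2)
mspe_cf = np.mean((y - X_all @ b_cf) ** 2)
print(f"MSPE naive: {mspe_naive:.3f}  MSPE control function: {mspe_cf:.3f}")
```

In this assumed setup the naive fit absorbs the selected-sample mean of the error term into its intercept and slope, while the adjusted fit targets the population relationship between `y` and `x`. Any other learner could replace the least-squares call in step 3.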

Keywords: sample selection; machine learning; control function; inverse probability weighting
JEL-codes: C13 C31 C55 D72
Date: 2023-06
New Economics Papers: this item is included in nep-ain, nep-big and nep-cmp

Downloads: (external link)
https://drive.google.com/file/d/1n8EZlC89OnB6BC8AE ... LqS/view?usp=sharing (application/pdf)

Related works:
Journal Article: Addressing sample selection bias for machine learning methods (2024)
Working Paper: Addressing Sample Selection Bias for Machine Learning Methods (2023)
Working Paper: Addressing Sample Selection Bias for Machine Learning Methods (2021)
Working Paper: Addressing Sample Selection Bias for Machine Learning Methods (2021)

Persistent link: https://EconPapers.repec.org/RePEc:umc:wpaper:2310

Page updated 2025-03-29
Handle: RePEc:umc:wpaper:2310