Estimating wage disparities using foundation models

Keyon Vafa, Susan Athey and David M. Blei
Additional contact information
Keyon Vafa: Harvard Data Science Initiative, Harvard University, Cambridge, MA 02138
Susan Athey: Stanford Institute for Human-Centered Artificial Intelligence, Stanford University, Stanford, CA 94305
David M. Blei: Department of Statistics, Columbia University, New York, NY 10027

Proceedings of the National Academy of Sciences, 2025, vol. 122, issue 22, e2427298122

Abstract:

The rise of foundation models marks a paradigm shift in machine learning: instead of training specialized models from scratch, foundation models are trained on massive datasets before being adjusted or fine-tuned to make predictions on smaller datasets. Initially developed for text, foundation models have also excelled at making predictions about social science data. However, while many estimation problems in the social sciences use prediction as an intermediate step, they ultimately require different criteria for success. In this paper, we develop methods for fine-tuning foundation models to perform these estimation problems. We first characterize an omitted variable bias that can arise when a foundation model is fine-tuned in the standard way: to minimize predictive error. We then provide a set of conditions for fine-tuning under which estimates derived from a foundation model are √n-consistent. Based on this theory, we develop fine-tuning algorithms that empirically mitigate this omitted variable bias. To demonstrate our ideas, we study gender wage gap estimation. Classical methods for estimating the adjusted wage gap employ simple predictive models of wages, which can induce omitted variable bias because they condition on coarse summaries of career history. Instead, we use a custom-built foundation model, capturing a richer representation of career history. Using data from the Panel Study of Income Dynamics, we find that career history explains more of the gender wage gap than standard econometric models can measure, and we identify elements of career history that are omitted by standard models but are important for explaining the gap.

Keywords: machine learning; foundation models; labor economics; econometrics
Date: 2025

Downloads: (external link)
https://doi.org/10.1073/pnas.2427298122 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:nas:journl:v:122:y:2025:p:e2427298122

Bibliographic data for series maintained by PNAS Product Team.

 
Page updated 2025-06-28
Handle: RePEc:nas:journl:v:122:y:2025:p:e2427298122