LABOR-LLM: Language-Based Occupational Representations with Large Language Models
Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia and Keyon Vafa
Papers from arXiv.org
Abstract:
Vafa et al. (2024) introduced CAREER, a transformer-based econometric model that predicts a worker's next job as a function of career history (an "occupation model"). CAREER was initially estimated ("pre-trained") on a large but unrepresentative resume dataset, yielding a "foundation model," and parameter estimation was then continued ("fine-tuned") on data from a representative survey. CAREER outperformed benchmark models in predictive performance. This paper considers an alternative in which the resume-based foundation model is replaced by a large language model (LLM). We convert the tabular survey data into text files that resemble resumes and fine-tune the LLMs on these files with the objective of predicting the next token (word). The resulting fine-tuned LLM is then used as an input to an occupation model, whose predictive performance surpasses that of all prior models. We demonstrate the value of fine-tuning and further show that, when additional career data from a different population is added, fine-tuned smaller LLMs surpass fine-tuned larger models.
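To make the approach concrete, below is a minimal sketch (not the authors' code) of two of the steps the abstract describes: rendering a tabular career history as resume-like text, and scoring candidate next occupations with a causal LLM via summed next-token log-probabilities. The resume template, the stand-in "gpt2" checkpoint (in place of the fine-tuned LLM), and the example occupation list are illustrative assumptions, not taken from the paper; the fine-tuning step itself is omitted.

```python
# Sketch: tabular career history -> resume-like text -> occupation scoring
# with a causal LLM. Model, template, and occupations are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def history_to_resume(jobs):
    """Render a list of (year, occupation) rows as resume-style text."""
    lines = [f"{year}: {occupation}" for year, occupation in jobs]
    return "Work history:\n" + "\n".join(lines) + "\nNext job:"

# Stand-in checkpoint; the paper would use an LLM fine-tuned on
# resume-style text built from the survey data.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_job_log_prob(resume_text, occupation):
    """Sum of token log-probabilities the LLM assigns to `occupation`
    as a continuation of `resume_text`."""
    prompt_ids = tokenizer(resume_text, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + occupation, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict token t+1; slice the span that
    # predicts the target occupation's tokens.
    log_probs = torch.log_softmax(logits[0, prompt_ids.shape[1] - 1:-1], dim=-1)
    return log_probs.gather(1, target_ids[0].unsqueeze(1)).sum().item()

history = [(2018, "Cashier"), (2020, "Retail Supervisor")]
candidates = ["Store Manager", "Cashier", "Software Developer"]
resume = history_to_resume(history)
scores = {occ: next_job_log_prob(resume, occ) for occ in candidates}
print(max(scores, key=scores.get))  # highest-scoring candidate occupation
```

In the paper's pipeline, the LLM is first fine-tuned on the resume-style text files with a standard next-token (causal language modeling) objective before its predictions feed the occupation model.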
Date: 2024-06, Revised 2025-02
New Economics Papers: this item is included in nep-ain, nep-big, nep-cmp and nep-mac
Downloads: http://arxiv.org/pdf/2406.17972 Latest version (application/pdf)
Related works:
Working Paper: Labor-LLM: Language-Based Occupational Representations with Large Language Models (2024) 
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2406.17972