EconPapers    
Using Generative Pre-Trained Transformers (GPT) for Supervised Content Encoding: An Application in Corresponding Experiments

Alexander Churchill, Shamitha Pichika and Chengxin Xu
Additional contact information
Chengxin Xu: Seattle University

No 6fpgj, SocArXiv from Center for Open Science

Abstract: Supervised content encoding applies a given codebook to a larger non-numerical dataset and is central to empirical research in public administration. Not only is it a key analytical approach for qualitative studies, but the method also allows researchers to measure constructs from non-numerical data, which can then be used for quantitative description and causal inference. Despite its utility, supervised content encoding faces challenges, including high cost and low reproducibility. In this report, we test whether large language models (LLMs), specifically generative pre-trained transformers (GPT), can solve these problems. Using email messages collected from a national corresponding experiment in the U.S. nursing home market as an example, we demonstrate that although we find some disparities between GPT and human coding results, the disagreement is acceptable for certain research designs, which makes GPT encoding a potential substitute for human encoders. Practical suggestions for encoding with GPT are provided at the end of the letter.
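The abstract's claim that GPT–human disagreement is "acceptable" rests on measuring inter-coder agreement. As an illustration only (the letter's own code and data are not reproduced here, and the labels below are hypothetical), a minimal sketch of computing Cohen's kappa, a standard chance-corrected agreement statistic, between GPT-assigned and human-assigned codes:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: chance-corrected agreement between two coders."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed agreement: share of items both coders labeled identically.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical binary codes for ten email replies (1 = responsive, 0 = not).
human = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
gpt   = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(human, gpt)  # ~0.58: moderate agreement
```

Whether a given kappa is "acceptable" depends on the research design, as the letter notes: a descriptive tally tolerates more coding noise than, say, a construct used as an outcome in causal inference.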

Date: 2024-01-25
New Economics Papers: this item is included in nep-ain, nep-big, nep-cmp and nep-exp

Downloads: (external link)
https://osf.io/download/65b191e7b1f2b5065eb0eb38/



Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:6fpgj

DOI: 10.31219/osf.io/6fpgj


More papers in SocArXiv from Center for Open Science
Bibliographic data for this series is maintained by OSF.

 
Page updated 2025-03-19
Handle: RePEc:osf:socarx:6fpgj