Embedding Regression: Models for Context-Specific Description and Inference
Pedro L. Rodriguez, Arthur Spirling and Brandon M. Stewart
American Political Science Review, 2023, vol. 117, issue 4, 1255-1274
Abstract:
Social scientists commonly seek to make statements about how word use varies over circumstances—including time, partisan identity, or some other document-level covariate. For example, researchers might wish to know how Republicans and Democrats diverge in their understanding of the term “immigration.” Building on the success of pretrained language models, we introduce the à la carte on text (conText) embedding regression model for this purpose. This fast and simple method produces valid vector representations of how words are used—and thus what words “mean”—in different contexts. We show that it outperforms slower, more complicated alternatives and works well even with very few documents. The model also allows for hypothesis testing and statements about statistical significance. We demonstrate that it can be used for a broad range of important tasks, including understanding US polarization, historical legislative development, and sentiment detection. We provide open-source software for fitting the model.
Date: 2023
Downloads: (external link)
https://www.cambridge.org/core/product/identifier/ ... type/journal_article link to article abstract page (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:cup:apsrev:v:117:y:2023:i:4:p:1255-1274_8
American Political Science Review is published by Cambridge University Press, UPH, Shaftesbury Road, Cambridge CB2 8BS, UK.
Bibliographic data for series maintained by Kirk Stebbing.