Paired completion: quantifying issue-framing at scale with LLMs
Simon D Angus and Lachlan O'Neill
Additional contact information
Simon D Angus: SoDa Laboratories & Dept. of Economics, Monash Business School
Lachlan O'Neill: SoDa Laboratories, Monash Business School
No 2024-02, SoDa Laboratories Working Paper Series from Monash University, SoDa Laboratories
Abstract:
Detecting and quantifying issue framing in textual discourse - the slant or perspective one takes to a given topic (e.g. climate science vs. denialism, misogyny vs. gender equality) - is highly valuable to a range of end-users from social and political scientists to program evaluators and policy analysts. Being able to identify statistically significant shifts, reversals, or changes in issue framing in public discourse would enable the quantitative evaluation of interventions, actors and events that shape discourse. However, issue framing is notoriously challenging for automated natural language processing (NLP) methods since the words and phrases used by either 'side' of an issue are often held in common, with only subtle stylistic flourishes separating their use. Here we develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets. By introducing a novel application of next-token log probabilities derived from generative large language models (LLMs) we show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue, a method we call 'paired completion'. Through 192 independent experiments over three novel, synthetic datasets, we evaluate paired completion against prompt-based LLM methods and labelled methods using traditional NLP and recent LLM contextual embeddings. We additionally conduct a cost-based analysis to mark out the feasible set of performant methods at production-level scales, and a model bias analysis. Together, our work demonstrates a feasible path to scalable, accurate and low-bias issue-framing in large corpora.
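The abstract describes paired completion only at a high level, so the following is a minimal sketch of one plausible reading of the idea, not the authors' implementation. It assumes the score for a target text is the difference between its log probability when conditioned on a few example texts from perspective A and when conditioned on a few from perspective B. To stay self-contained, a smoothed unigram model estimated from the priming texts stands in for the generative LLM; the names `conditional_logprob` and `paired_completion_score` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def conditional_logprob(context, target, vocab_size, alpha=1.0):
    """Toy stand-in for an LLM: log P(target | context).

    Uses an add-alpha smoothed unigram model estimated from the context only.
    A real system would instead sum the next-token log probabilities that a
    generative LLM assigns to the target tokens given the context (assumption).
    """
    counts = Counter(tokenize(context))
    total = sum(counts.values())
    return sum(
        math.log((counts[tok] + alpha) / (total + alpha * vocab_size))
        for tok in tokenize(target)
    )


def paired_completion_score(target, primers_a, primers_b):
    """Log-probability difference for the target under the two framings.

    Positive: the target is more likely after perspective-A primers.
    Negative: the target is more likely after perspective-B primers.
    """
    vocab = set(tokenize(target))
    for primer in primers_a + primers_b:
        vocab.update(tokenize(primer))
    vocab_size = len(vocab)
    lp_a = conditional_logprob(" ".join(primers_a), target, vocab_size)
    lp_b = conditional_logprob(" ".join(primers_b), target, vocab_size)
    return lp_a - lp_b


# Illustrative primers for each perspective (invented for this sketch).
primers_a = [
    "the climate is warming because of human emissions",
    "scientists agree emissions drive warming",
]
primers_b = [
    "the climate has always changed naturally",
    "warming alarmism is a hoax",
]

score = paired_completion_score(
    "human emissions are warming the planet", primers_a, primers_b
)
print("aligned with A" if score > 0 else "aligned with B")
```

The sign of the score gives a framing label and its magnitude a rough confidence. In this toy version the signal comes purely from word overlap with the primers, which is exactly the weakness the abstract attributes to traditional methods; the point of using an LLM's conditional log probabilities would be to capture stylistic and semantic cues beyond shared vocabulary.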
Keywords: slant detection; text-as-data; synthetic data; computational linguistics
JEL-codes: C19 C55
Date: 2024-06
New Economics Papers: this item is included in nep-ain, nep-big and nep-cmp
Downloads:
http://soda-wps.s3-website-ap-southeast-2.amazonaw ... r/sodwps/2024-02.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:ajr:sodwps:2024-02
Ordering information: This working paper can be ordered from
https://www.monash.edu/business/soda-labs/home
The SoDa Laboratories Working Paper Series is published by SoDa Laboratories, Monash University, Victoria 3800, Australia.
Bibliographic data for series maintained by Ashani Amarasinghe.