LLM-Based Measurement of Latent Attributes in Trade Data
Matthew DiGiuseppe, Xuelong Fu and Michael E Flynn
Additional contact information
Matthew DiGiuseppe: Leiden University
Michael E Flynn: Kansas State University
No t8wdg_v1, SocArXiv from Center for Open Science
Abstract:
Trade data are available at a high level of disaggregation, allowing scholars to examine flows of highly specific goods. Yet the sheer number of goods classifications (5,000+) makes it difficult to analyze trade flows and tariff policy at a mid-level of aggregation beyond a few existing categorizations. Here, we outline a method that can scale, not merely classify, traded goods on researcher-defined dimensions that are orthogonal to existing classification schemes. We propose that the embedded knowledge in large language models (LLMs) can be used to conduct pairwise comparisons (PWCs) of Harmonized System (HS) product descriptions by determining their relative proximity to a specific concept. A Bayesian Bradley-Terry model then uses these PWCs to place individual items on a latent scale of interest. These estimates and their associated uncertainty can then be used for downstream descriptive or causal analysis.
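To make the scaling step concrete, here is a minimal sketch of how pairwise comparisons could be turned into latent scores with a Bradley-Terry model. It is not the authors' implementation. It computes a maximum a posteriori (MAP) point estimate under an assumed Normal(0, prior_sd^2) prior by gradient ascent, whereas a fully Bayesian fit would sample the posterior and report uncertainty. The function name `fit_bradley_terry_map`, the three item labels, and the hard-coded comparisons are all hypothetical stand-ins; in the described method each comparison would come from an LLM judging which of two HS product descriptions lies closer to the concept of interest.

```python
import math


def fit_bradley_terry_map(n_items, comparisons, prior_sd=1.0, n_steps=5000):
    """MAP estimate of Bradley-Terry scores with a Normal(0, prior_sd^2) prior.

    comparisons: list of (winner, loser) index pairs, where the "winner" is
    the item judged closer to the concept of interest.
    Returns a list of latent scores, one per item.
    """
    theta = [0.0] * n_items
    # Conservative step size: the reciprocal of an upper bound on the
    # curvature of the (concave) log posterior.
    step = 1.0 / (1.0 / prior_sd**2 + 0.5 * len(comparisons))
    for _ in range(n_steps):
        # Gradient of the log prior: -theta / prior_sd^2.
        grad = [-t / prior_sd**2 for t in theta]
        for winner, loser in comparisons:
            # Bradley-Terry: P(winner beats loser) = sigmoid(theta_w - theta_l).
            p_win = 1.0 / (1.0 + math.exp(-(theta[winner] - theta[loser])))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        theta = [t + step * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy input: three made-up item labels and hand-written
# comparisons standing in for LLM judgments (illustration only, not real data).
items = ["item A", "item B", "item C"]
comparisons = [(0, 1), (0, 2), (0, 1), (2, 1), (0, 2), (2, 1)]

scores = fit_bradley_terry_map(len(items), comparisons)
ranking = sorted(zip(items, scores), key=lambda pair: pair[1], reverse=True)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

In this toy input, item A wins every comparison and item B loses every comparison, so the fitted scores order the items A, C, B. The prior keeps the scores finite even for items that never lose, which is one practical reason to prefer a Bayesian or penalized fit over plain maximum likelihood.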
Date: 2026-03-27
New Economics Papers: this item is included in nep-ain, nep-big, nep-cmp and nep-int
Downloads: (external link)
https://osf.io/download/69c5385fa382e3ec23254897/
Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:t8wdg_v1
DOI: 10.31219/osf.io/t8wdg_v1