EconPapers    

Can LLMs Credibly Transform the Creation of Panel Data from Diverse Historical Tables?

Veronica Backer-Peral, Vitaly Meursault and Christopher Severen
Additional contact information
Vitaly Meursault: https://www.philadelphiafed.org/our-people/meursault-vitaly

No 25-28, Working Papers from Federal Reserve Bank of Philadelphia

Abstract: Multimodal LLMs offer a watershed change for the digitization of historical tables, enabling low-cost processing centered on domain expertise rather than technical skills. We rigorously validate an LLM-based pipeline on a new panel of historical county-level vehicle registrations. This pipeline is estimated to be 100 times less expensive than outsourcing options, reduces critical parsing errors from 40% to 0.3%, and matches human-validated gold standard data with an R² of 98.6%. Analyses of growth and persistence in vehicle adoption are statistically indistinguishable whether using LLM or gold standard data. LLM-based digitization unlocks complex historical tables, enabling new economic analyses and broader researcher participation.

Keywords: OCR; Layout Parsing; Entity Linking; Multimodal LLM; Vehicle Adoption
JEL-codes: C80 N32 N72 R40
Pages: 33
Date: 2025-09-30

Downloads: (external link)
https://www.philadelphiafed.org/-/media/FRBP/Asset ... ers/2025/wp25-28.pdf (application/pdf)

Related works:
Working Paper: Can LLMs Credibly Transform the Creation of Panel Data from Diverse Historical Tables? (2025)

Persistent link: https://EconPapers.repec.org/RePEc:fip:fedpwp:101850

DOI: 10.21799/frbp.wp.2025.28

Bibliographic data for series maintained by Beth Paul.

Page updated 2025-10-02
Handle: RePEc:fip:fedpwp:101850