Combining AI and Established Methods for Historical Document Analysis

Daniel Moulton, Larry Santucci and Robyn Smith

No 25-02, Consumer Finance Institute discussion papers from Federal Reserve Bank of Philadelphia

Abstract: This paper examines methodological approaches for extracting structured data from large-scale historical document archives, comparing “hyperspecialized” versus “adaptive modular” strategies. Using 56 years of Philadelphia property deeds as a case study, we show the benefits of the adaptive modular approach, which leverages optical character recognition (OCR), full-text search, and frontier large language models (LLMs) to identify deeds containing specific restrictive use language, achieving 98% precision and 90–98% recall. Our adaptive modular methodology enables analysis of historically important economic phenomena, including restrictive property covenants, their precise geographic locations, and the localized neighborhood effects of these restrictions. This approach should be easily adaptable to other research involving deeds and similar documents.

Keywords: large language models (LLMs); artificial intelligence (AI); machine learning (ML); restrictive covenants; deeds; property; real estate; housing; John Coltrane; digitization
JEL-codes: C81 N32 R31 R38
Pages: 30
Date: 2025-10-25
New Economics Papers: this item is included in nep-ain and nep-his

Downloads: (external link)
https://www.philadelphiafed.org/-/media/FRBP/Asset ... n-Papers/dp25-02.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:fip:fedpdp:102114

DOI: 10.21799/frbp.dp.2025.02

Bibliographic data for series maintained by Beth Paul.

 
Page updated 2025-12-06
Handle: RePEc:fip:fedpdp:102114