Digitization and data frames for card index records

Someswar Amujala, Angela Vossmeyer and Sanjiv Das

Explorations in Economic History, 2023, vol. 87, issue C

Abstract: We develop a methodology for converting card index archival records into usable data frames for statistical and textual analyses. Leveraging machine learning and natural-language processing tools from Amazon Web Services (AWS), we overcome hurdles associated with character recognition, inconsistent data reporting, column misalignment, and irregular naming. In this article, we detail the step-by-step conversion process and discuss remedies for common problems and edge cases, using historical records from the Reconstruction Finance Corporation.

Keywords: Machine learning; Natural-language processing; Archival records; Unstructured data
JEL-codes: C8 N4
Date: 2023

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S001449832200047X
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:exehis:v:87:y:2023:i:c:s001449832200047x

DOI: 10.1016/j.eeh.2022.101469

Explorations in Economic History is currently edited by R.H. Steckel

More articles in Explorations in Economic History from Elsevier
Bibliographic data for series maintained by Catherine Liu.

Handle: RePEc:eee:exehis:v:87:y:2023:i:c:s001449832200047x