
Mining Chinese Historical Sources At Scale: A Machine Learning-Approach to Qing State Capacity

Wolfgang Keller, Carol Shiue and Sen Yan

No 19517, CEPR Discussion Papers from Centre for Economic Policy Research

Abstract: Primary historical sources are often bypassed in favor of secondary sources because of the high human cost of accessing and extracting primary information, especially in lower-resource settings. We propose a supervised machine-learning approach to the natural language processing of Chinese historical data. An application to identifying different forms of social unrest in the Veritable Records of the Qing Dynasty shows that the approach dramatically cuts the cost of using primary-source data while being free from human bias, reproducible, and flexible enough to address particular questions. External evidence on triggers of unrest also suggests that the computer-based approach is no less successful in identifying social unrest than human researchers are.

Keywords: Natural language processing
JEL-codes: C8 N45
Date: 2024-09

Downloads: (external link)
https://cepr.org/publications/DP19517 (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:cpr:ceprdp:19517

Ordering information: This working paper can be ordered from
https://cepr.org/publications/DP19517


More papers in CEPR Discussion Papers from Centre for Economic Policy Research 33 Great Sutton Street, London EC1V 0DX, UK.
Bibliographic data for series maintained by CEPR.

 
Page updated 2026-05-29
Handle: RePEc:cpr:ceprdp:19517