Geoparsing history: Locating commodities in ten million pages of nineteenth-century sources
Jim Clifford, Beatrice Alex, Colin M. Coates, Ewan Klein and Andrew Watson
Historical Methods: A Journal of Quantitative and Interdisciplinary History, 2016, vol. 49, issue 3, 115-131
Abstract:
In the Trading Consequences project, historians, computational linguists, and computer scientists collaborated to develop a text mining system that extracts information from a vast body of digitized, published English-language sources from the “long nineteenth century” (1789 to 1914). The project focused on identifying relationships within the texts between commodities, geographical locations, and dates. The authors explain the methodology, uses, and limitations of applying digital humanities techniques to historical research, and they argue that interdisciplinary approaches are critically important in addressing the technical challenges that arise. Collaborative teamwork of the kind described here has considerable potential to produce further advances in the large-scale analysis of historical documents.
Date: 2016
Downloads:
http://hdl.handle.net/10.1080/01615440.2015.1116419 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:vhimxx:v:49:y:2016:i:3:p:115-131
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/vhim20
DOI: 10.1080/01615440.2015.1116419
Historical Methods: A Journal of Quantitative and Interdisciplinary History is currently edited by J. David Hacker and Kenneth Sylvester