EconPapers    
Economics at your fingertips  
 

Text Authorship Identified Using the Dynamics of Word Co-Occurrence Networks

Camilo Akimushkin, Diego Raphael Amancio and Osvaldo Novais Oliveira

PLOS ONE, 2017, vol. 12, issue 1, 1-15

Abstract: Automatic identification of authorship in disputed documents has benefited from complex network theory as this approach does not require human expertise or detailed semantic knowledge. Networks modeling entire books can be used to discriminate texts from different sources and understand network growth mechanisms, but only a few studies have probed the suitability of networks in modeling small chunks of text to grasp stylistic features. In this study, we introduce a methodology based on the dynamics of word co-occurrence networks representing written texts to classify a corpus of 80 texts by 8 authors. The texts were divided into sections with equal number of linguistic tokens, from which time series were created for 12 topological metrics. Since 73% of all series were stationary (ARIMA(p, 0, q)) and the remaining were integrable of first order (ARIMA(p, 1, q)), probability distributions could be obtained for the global network metrics. The metrics exhibit bell-shaped non-Gaussian distributions, and therefore distribution moments were used as learning attributes. With an optimized supervised learning procedure based on a nonlinear transformation performed by Isomap, 71 out of 80 texts were correctly classified using the K-nearest neighbors algorithm, i.e. a remarkable 88.75% author matching success rate was achieved. Hence, purely dynamic fluctuations in network metrics can characterize authorship, thus paving the way for a robust description of large texts in terms of small evolving networks.

Date: 2017
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (7)

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0170527 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 70527&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0170527

DOI: 10.1371/journal.pone.0170527

Access Statistics for this article

More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone (plosone@plos.org).

 
Page updated 2025-03-19
Handle: RePEc:plo:pone00:0170527