Authorship Attribution of Noisy Text Data With a Comparative Study of Clustering Methods

Zohra Hamadache and Halim Sayoud
Additional contact information
Zohra Hamadache: USTHB University, Bab Ezzouar, Algeria
Halim Sayoud: USTHB University, Bab Ezzouar, Algeria

International Journal of Knowledge and Systems Science (IJKSS), 2018, vol. 9, issue 2, 45-69

Abstract: With the rapid growth in the volume of data available on the internet, visual analytics (VA) has emerged as a way to visualize multidimensional data from different perspectives, revealing interesting information and making the data clearer and more intelligible. In this investigation, the authors focus on the VA-based authorship attribution (AA) task applied to noisy text data. The article also proposes a 3D visual analytics technique based on a sphere implementation. The dataset contains several text documents written by 5 American philosophers, with an average length of 850 words per text, which were scanned and then corrupted with different noise levels. The results show that hierarchical clustering with a fully automated threshold achieves high authorship attribution accuracy, especially with character trigrams and ending bigrams, for which the clustering recognition rate (CRR) reaches 100% at noise levels from 0% to 7%. In addition, the proposed 3D sphere technique shows high clustering performance, mainly with words.

Date: 2018

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJKSS.2018040103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jkss00:v:9:y:2018:i:2:p:45-69

International Journal of Knowledge and Systems Science (IJKSS) is currently edited by Van Nam Huynh

Page updated 2025-03-19
Handle: RePEc:igg:jkss00:v:9:y:2018:i:2:p:45-69