
Performance of authorship attribution classifiers with short texts: application of religious Arabic fatwas

Mohammed Al-Sarem, Abdel-Hamid Emara and Ahmed Abdel Wahab

International Journal of Data Mining, Modelling and Management, 2020, vol. 12, issue 3, 350-364

Abstract: Although authorship attribution is a well-known problem in the authorship analysis domain, research in Arabic contexts is still limited. In addition, the performance of attribution methods on training sets of short textual documents has not been examined thoroughly, even in other languages such as English, Chinese, Spanish and Dutch. This work therefore examines the performance of attribution classifiers on short Arabic textual documents. The experiments use the following well-known classifiers: the C4.5 decision tree, the naive Bayes model, the K-NN method, the Markov model, SMO and the Burrows Delta method. We experiment with various feature combinations. The results show that combining word-based lexical features with structural features yields the best accuracy. Accordingly, we use this combination as a baseline for further investigation. We also examine the effect of combining the n-gram features with this baseline. The results indicate that some classifiers improve while others do not. In addition, the results show that the naive Bayes method gives the highest accuracy among all the attribution classifiers.
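
Of the classifiers named in the abstract, Burrows Delta is the simplest to illustrate. The following is a minimal sketch of the general technique, not the authors' implementation (the keywords indicate the study used the JGAAP tool). The function names `relative_freqs` and `burrows_delta`, the whitespace tokenization, and the default of 50 most frequent words are all assumptions made for illustration; real Arabic text would likely need proper tokenization and normalization first.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in a token list."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in vocab]


def burrows_delta(author_texts, unknown_text, n_words=50):
    """Return {author: delta}; a smaller delta means a closer style.

    author_texts: dict mapping author name -> known text (hypothetical input shape)
    unknown_text: the disputed text to attribute
    """
    # Assumption: naive whitespace tokenization.
    author_tokens = {a: t.split() for a, t in author_texts.items()}
    unknown_tokens = unknown_text.split()

    # Vocabulary = the most frequent words across all candidate authors.
    corpus_counts = Counter()
    for toks in author_tokens.values():
        corpus_counts.update(toks)
    vocab = [w for w, _ in corpus_counts.most_common(n_words)]

    # One relative-frequency profile per author.
    profiles = {a: relative_freqs(t, vocab) for a, t in author_tokens.items()}

    # Per-word mean and standard deviation across the author profiles.
    columns = list(zip(*profiles.values()))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]

    def z_scores(freqs):
        # Words with zero variance carry no signal, so their z-score is set to 0.
        return [(f - m) / s if s > 0 else 0.0
                for f, m, s in zip(freqs, mus, sigmas)]

    z_unknown = z_scores(relative_freqs(unknown_tokens, vocab))

    # Delta = mean absolute difference between z-score vectors.
    return {a: mean(abs(x - y) for x, y in zip(z_scores(p), z_unknown))
            for a, p in profiles.items()}


if __name__ == "__main__":
    known = {
        "A": "the cat sat on the mat the cat ran",
        "B": "a dog and a bird and a fish and a frog",
    }
    scores = burrows_delta(known, "the cat sat on the rug")
    print(min(scores, key=scores.get), scores)
```

The attributed author is the one with the smallest delta. With short texts such as fatwas, the frequency estimates rest on few tokens, which is one plausible reason a study would compare Delta against other classifiers on short documents.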

Keywords: authorship attribution; AA; stylometric features; SF; attribution classifiers; JGAAP tool; Arabic language.
Date: 2020

Downloads: (external link)
http://www.inderscience.com/link.php?id=108719 (text/html)
Access to full text is restricted to subscribers.



Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:12:y:2020:i:3:p:350-364


More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.

 
Page updated 2025-03-19
Handle: RePEc:ids:ijdmmm:v:12:y:2020:i:3:p:350-364