Latent Dirichlet Allocation and POS Tags Based Method for External Plagiarism Detection: LDA and POS Tags Based Plagiarism Detection

Ali Daud, Jamal Ahmad Khan, Jamal Abdul Nasir, Rabeeh Ayaz Abbasi, Naif Radi Aljohani and Jalal S. Alowibdi
Additional contact information
Ali Daud: King Abdulaziz University, Jeddah, Saudi Arabia & International Islamic University, Islamabad, Pakistan
Jamal Ahmad Khan: International Islamic University, Islamabad, Pakistan
Jamal Abdul Nasir: International Islamic University, Islamabad, Pakistan
Rabeeh Ayaz Abbasi: King Abdulaziz University, Jeddah, Saudi Arabia & Quaid-i-Azam University, Islamabad, Pakistan
Naif Radi Aljohani: Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Jalal S. Alowibdi: Faculty of Computing and Information Technology, University of Jeddah, Jeddah, Saudi Arabia

International Journal on Semantic Web and Information Systems (IJSWIS), 2018, vol. 14, issue 3, 53-69

Abstract: In this article, we present a new method for external plagiarism detection that combines semantic and syntactic information. The proposed approach uses latent Dirichlet allocation (LDA) together with part-of-speech (POS) tags to detect plagiarism between the sample document and a number of source documents. The basic hypothesis is that considering both semantic and syntactic information between two text documents may improve the performance of the plagiarism detection task. Our method has two steps: a pre-processing step, in which we detect the topics of the sentences in the documents using LDA and convert each sentence into an array of POS tags; and a post-processing step, in which the suspicious cases are verified purely on the basis of semantic rules. For two types of external plagiarism (copy and random obfuscation), we empirically compare our approach with the state-of-the-art N-gram based and stop-word N-gram based methods and observe significant improvements.
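
The abstract does not spell out how topic and POS information are combined, so the following is only a minimal sketch of one plausible candidate-selection step, not the authors' implementation. It assumes each sentence already carries a topic distribution (as an LDA model might produce) and a list of POS tags (as a tagger might produce). The dictionary keys, the thresholds, the cosine measure, and the use of `difflib.SequenceMatcher` for comparing tag sequences are all illustrative assumptions; the post-processing semantic rules are not specified in the abstract and are not attempted here.

```python
from difflib import SequenceMatcher
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pos_similarity(tags_a, tags_b):
    """Similarity ratio (0..1) between two ordered POS tag sequences."""
    return SequenceMatcher(None, tags_a, tags_b).ratio()


def candidate_pairs(sample, sources, topic_min=0.8, pos_min=0.8):
    """Return (i, j) index pairs whose topic and POS similarity both pass.

    Each sentence is a dict with hypothetical keys:
      "topics": topic distribution (assumed to come from an LDA model)
      "pos":    list of POS tags (assumed to come from a POS tagger)
    The thresholds are arbitrary illustrative values.
    """
    pairs = []
    for i, s in enumerate(sample):
        for j, t in enumerate(sources):
            if (cosine(s["topics"], t["topics"]) >= topic_min
                    and pos_similarity(s["pos"], t["pos"]) >= pos_min):
                pairs.append((i, j))
    return pairs


# Toy data: one sample sentence, two source sentences.
sample = [{"topics": [0.9, 0.1], "pos": ["DT", "NN", "VBZ", "JJ"]}]
sources = [
    {"topics": [0.85, 0.15], "pos": ["DT", "NN", "VBZ", "JJ"]},
    {"topics": [0.1, 0.9], "pos": ["PRP", "VBD", "IN", "DT", "NN"]},
]
print(candidate_pairs(sample, sources))  # [(0, 0)]
```

In this toy run only the first source sentence is flagged, since it shares both the dominant topic and the tag sequence with the sample sentence. In a fuller pipeline, the flagged pairs would then go to a verification step.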

Date: 2018
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://services.igi-global.com/resolvedoi/resolve ... 18/IJSWIS.2018070103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jswis0:v:14:y:2018:i:3:p:53-69

International Journal on Semantic Web and Information Systems (IJSWIS) is currently edited by Brij Gupta

Page updated 2025-05-25
Handle: RePEc:igg:jswis0:v:14:y:2018:i:3:p:53-69