EconPapers    

Plagiarism detection based on semantic analysis

Indrajit Mukherjee, Bipul Kumar, Samarth Singh and Kishan Sharma

International Journal of Knowledge and Learning, 2018, vol. 12, issue 3, 242-254

Abstract: Plagiarism means copying and pasting text, changing some of its words, or substituting synonymous or near-synonymous words, without citing the source. Plagiarism is on the rise, especially in academia and research, owing to the availability of digital text documents on the internet, which can easily be copied and pasted. Existing approaches to plagiarism detection have either ignored or made limited use of information about semantic similarity between words. We propose a method that measures the semantic similarity between documents by mapping keywords (verbs, adverbs, adjectives, descriptors, etc.) to nouns and then computing the similarity between the mapped words, which addresses these shortcomings. The algorithm is evaluated on the Corpus of Plagiarised Short Answers (Clough and Stevenson, 2011). The experiments show that the proposed algorithm gives accurate results in detecting semantic similarity between documents and was found to outperform previously published methods.
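
The idea of comparing keyword-to-noun mappings can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the scoring details, so the 0/1 word similarity, the averaging scheme, and every name below (SYNONYM_GROUPS, word_similarity, pair_similarity, document_similarity) are assumptions made for illustration. A small hand-written synonym table stands in for WordNet, and the (keyword, noun) pairs are assumed to have been extracted already.

```python
# Hypothetical synonym groups standing in for WordNet synsets.
SYNONYM_GROUPS = [
    {"quick", "fast", "rapid"},
    {"car", "automobile"},
    {"big", "large"},
    {"house", "home"},
]


def word_similarity(a, b):
    """Return 1.0 for identical or synonymous words, else 0.0 (assumed metric)."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return 1.0 if any(a in g and b in g for g in SYNONYM_GROUPS) else 0.0


def pair_similarity(p, q):
    """Average the similarity of the two keywords and of the nouns they map to."""
    return 0.5 * (word_similarity(p[0], q[0]) + word_similarity(p[1], q[1]))


def directed_score(src, dst):
    """Mean, over pairs in src, of the best-matching pair found in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(pair_similarity(p, q) for q in dst) for p in src) / len(src)


def document_similarity(doc_a, doc_b):
    """Symmetric similarity between two lists of (keyword, noun) pairs."""
    return 0.5 * (directed_score(doc_a, doc_b) + directed_score(doc_b, doc_a))


# Each document is a list of (keyword, noun) pairs, assumed pre-extracted.
original = [("quick", "car"), ("big", "house")]
reworded = [("fast", "automobile"), ("large", "home")]
unrelated = [("green", "tree"), ("cold", "river")]

print(document_similarity(original, reworded))   # 1.0
print(document_similarity(original, unrelated))  # 0.0
```

In this sketch a document reworded purely with synonyms scores the same as a verbatim copy, which is the behaviour the abstract says purely lexical approaches miss. A graded WordNet-based measure could replace word_similarity without changing the rest.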

Keywords: semantic similarity; plagiarism detection; documents; WordNet
Date: 2018

Downloads: (external link)
http://www.inderscience.com/link.php?id=92316 (text/html)
Access to full text is restricted to subscribers.


Persistent link: https://EconPapers.repec.org/RePEc:ids:ijklea:v:12:y:2018:i:3:p:242-254

More articles in International Journal of Knowledge and Learning from Inderscience Enterprises Ltd

 
Page updated 2025-03-19
Handle: RePEc:ids:ijklea:v:12:y:2018:i:3:p:242-254