An efficient context-aware agglomerative fuzzy clustering framework for plagiarism detection

Chakrabarty, Anirban; Roy, Sudipta

International Journal of Data Mining, Modelling and Management, 2018, vol. 10, issue 2, 188-208

Abstract: Plagiarism is the act of copying content without acknowledging the original source. Although several commercial tools for plagiarism detection exist, detection remains difficult because of the growing volume of online publications. Existing plagiarism detection methods use paraphrasing, sentence and keyword matching, but such techniques have not been very effective. In this work, a framework for fuzzy-based plagiarism detection is proposed, using a context-aware agglomerative clustering approach with improved time complexity. The work aims to retrieve key concepts at the word, sentence and paragraph levels by integrating semantic features into a novel optimisation function so that plagiarism is detected effectively. Fuzzy clustering is applied to improve the robustness and consistency of results when clustering multi-disciplinary papers. The experimental analysis is supported by a comparison with other contemporary techniques, which indicates the superiority of the proposed approach for plagiarism detection.
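
The general idea of combining agglomerative clustering with fuzzy memberships can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details are not given in this record. Everything here is an assumption made for illustration: bag-of-words cosine similarity stands in for the paper's context-aware semantic features, average linkage with a fixed `threshold` stands in for the constrained objective function, and the function names (`tf_vector`, `cosine`, `agglomerate`, `fuzzy_memberships`) are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term counts (assumed stand-in for semantic features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerate(docs, threshold=0.5):
    """Average-linkage agglomerative clustering over document indices.

    Merging stops once the most similar pair of clusters falls below
    `threshold` (an assumed stopping rule, not taken from the paper).
    Returns the clusters and the pairwise similarity matrix.
    """
    vecs = [tf_vector(d) for d in docs]
    sim = [[cosine(u, v) for v in vecs] for u in vecs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (a, b) = max(
            (linkage(clusters[a], clusters[b]), (a, b))
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, sim


def fuzzy_memberships(clusters, sim):
    """Soft membership of each document in each cluster.

    A document's score for a cluster is its mean similarity to that
    cluster's members (including itself, if it is a member); scores
    are normalised so that each row sums to 1.
    """
    rows = []
    for i in range(len(sim)):
        scores = [sum(sim[i][j] for j in c) / len(c) for c in clusters]
        total = sum(scores)
        rows.append([s / total if total else 1.0 / len(clusters) for s in scores])
    return rows


docs = [
    "fuzzy clustering groups similar documents",
    "fuzzy clustering groups similar papers",
    "stock prices rose sharply this quarter",
]
clusters, sim = agglomerate(docs, threshold=0.5)
memberships = fuzzy_memberships(clusters, sim)
print(clusters)
print(memberships)
```

In this toy setup, documents with a high membership in the same cluster would be candidates for closer comparison. The soft memberships let a multi-disciplinary paper belong partly to several clusters rather than being forced into one.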

Keywords: fuzzy clustering; context similarity; plagiarism detection; spanning tree; agglomerative clustering; validity index; constrained objective function
Date: 2018

Downloads:
http://www.inderscience.com/link.php?id=92533 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:10:y:2018:i:2:p:188-208

More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd