A document similarity approach using grammatical linkages with graph databases
V. Priya and K. Umamaheswari
International Journal of Enterprise Network Management, 2019, vol. 10, issue 3/4, 211-223
Abstract:
Document similarity has become essential in many applications such as document retrieval, recommendation systems and plagiarism checking. Many similarity evaluation approaches rely on word-based document representation because it is very fast, but these approaches are not accurate when the documents use different language and vocabulary. Approaches that represent documents as graphs use relational knowledge, which is not feasible in many applications because graph operations are expensive. In this work, a novel approach to document similarity computation that utilises verbal intent has been developed. It improves similarity by increasing the number of linkages between two documents using verbs. Graph databases were used for faster performance. The performance of the system is evaluated on different review datasets using metrics such as cosine similarity, Jaccard similarity and the Dice coefficient. The verbal intent-based approach has registered promising results based on the links between two documents.
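The three metrics named in the abstract can be illustrated with a minimal sketch. It assumes plain whitespace tokenisation and word-level comparison; it does not reproduce the paper's verb-linkage or graph-database method, and the function names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real preprocessing.
    return text.lower().split()


def cosine_similarity(a, b):
    # Cosine of the angle between the two term-frequency vectors.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    # Shared distinct tokens divided by all distinct tokens.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice_coefficient(a, b):
    # Twice the shared distinct tokens divided by the sum of both set sizes.
    sa, sb = set(a), set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


# Hypothetical review snippets, used only for illustration.
doc1 = tokenize("the phone works well")
doc2 = tokenize("the phone works badly")

print(cosine_similarity(doc1, doc2))
print(jaccard_similarity(doc1, doc2))
print(dice_coefficient(doc1, doc2))
```

All three scores depend only on exact token overlap here, which is the limitation of word-based representations that the abstract describes.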
Keywords: graph databases; text similarity; grammatical linkages; verbal intent modelling; knowledge graphs.
Date: 2019
Downloads: (external link)
http://www.inderscience.com/link.php?id=103143 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijenma:v:10:y:2019:i:3/4:p:211-223
More articles in International Journal of Enterprise Network Management from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.