Efficient data clustering algorithm designed using a heuristic approach
Poonam Nandal, Deepa Bura and Meeta Singh
International Journal of Data Analysis Techniques and Strategies, 2021, vol. 13, issue 1/2, 3-14
Abstract:
Retrieving information from the large volumes of data held in a database is a major challenge today. Relevant information is extracted from the voluminous content available on the web using techniques such as natural language processing, lexical analysis, clustering and categorisation. In this paper, we discuss methods for clustering large amounts of data, which use different features to classify the data. Many problem-solving techniques now make use of a heuristic approach to design and develop efficient algorithms. We propose a clustering technique that uses a heuristic function to select the centroid, so that the clusters formed match the needs of the user. The heuristic function is based on conceptually similar data points, so that these points are grouped into accurate clusters. The k-means algorithm, which is widely used to cluster data, is also a focus of this paper. Empirically, the clusters formed, and the data points assigned to each cluster, are closer to human analysis than those produced by existing clustering algorithms.
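The abstract does not specify the heuristic function, so the following is only a minimal sketch of the general idea it describes: k-means in which the initial centroids are chosen by a pluggable heuristic and the distance measure can be Euclidean or Manhattan (both named in the keywords). The farthest-point seeding used here is a stand-in assumption, not the authors' concept-based heuristic, and all function names (`farthest_point_seeds`, `k_means`) are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    return math.dist(a, b)


def manhattan(a: Point, b: Point) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def farthest_point_seeds(points: Sequence[Point], k: int, distance: Distance) -> List[Point]:
    # Stand-in heuristic (an assumption, not the paper's function):
    # start from the first point, then repeatedly add the point that is
    # farthest from its nearest already-chosen seed.
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(distance(p, s) for s in seeds)))
    return seeds


def k_means(points: Sequence[Point], k: int, distance: Distance = euclidean,
            choose_seeds=farthest_point_seeds, max_iter: int = 100):
    # Initial centroids come from the heuristic rather than random sampling.
    centroids = list(choose_seeds(points, k, distance))
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = k_means(points, k=2)
print(labels)
```

Passing `distance=manhattan` switches the assignment step to Manhattan distance; the update step still uses the mean, which is the usual k-means choice but is not the exact minimiser under Manhattan distance (the per-coordinate median is). A different seeding rule can be supplied through `choose_seeds` without changing the rest of the loop.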
Keywords: clustering; natural language processing; k-means; concept; heuristic; Euclidean distance; 2D algorithm; information retrieval; Manhattan distance; density concept.
Date: 2021
Downloads:
http://www.inderscience.com/link.php?id=114666 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:13:y:2021:i:1/2:p:3-14
More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.