Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters
Wan Maseri Binti Wan Mohd,
A.H. Beg,
Tutut Herawan,
A. Noraziah and
K. F. Rabbi
Additional contact information
Wan Maseri Binti Wan Mohd: University Malaysia Pahang, Malaysia
A.H. Beg: University Malaysia Pahang, Malaysia
Tutut Herawan: University Malaysia Pahang, Malaysia
A. Noraziah: University Malaysia Pahang, Malaysia
K. F. Rabbi: University Malaysia Pahang, Malaysia
International Journal of Information Retrieval Research (IJIRR), 2011, vol. 1, issue 3, 1-14
Abstract:
K-means is an unsupervised learning algorithm for partitional clustering. It is popular and widely used for its simplicity and speed. K-means clustering produces a number of separate, flat (non-hierarchical) clusters and is suitable for generating globular clusters. The main drawback of the k-means algorithm is that the user must specify the number of clusters in advance. This paper presents an improved version of the K-means algorithm that automatically generates the initial number of clusters (k) and uses a new approach to defining the initial centroids, for an effective and efficient clustering process. The underlying mechanism has been analyzed and tested experimentally. The experimental results show that the number of iterations is reduced to 50% and that the run time is lower and consistently depends on the maximum distance between data points, regardless of how many data points there are.
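For orientation, the drawback named in the abstract can be seen in a minimal sketch of the standard k-means loop, in which the caller has to supply k. This is not the improved method proposed in the paper, whose details are not given here. The function name `kmeans`, its parameters, and the choice of the first k points as initial centroids are assumptions made only for illustration.

```python
import math

def kmeans(points, k, max_iter=100):
    """Standard (Lloyd-style) k-means sketch; the caller must choose k."""
    # Assumption: take the first k points as initial centroids so that
    # the run is deterministic. Real implementations often randomize.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels, iteration

# Two well-separated groups; k=2 has to be supplied by the user.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, iterations = kmeans(data, k=2)
```

Both the required `k` argument and the dependence of the iteration count on the starting centroids are visible in this sketch; these are the two aspects the abstract says the improved algorithm addresses.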
Date: 2011
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/ijirr.2011070101 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jirr00:v:1:y:2011:i:3:p:1-14
International Journal of Information Retrieval Research (IJIRR) is currently edited by Zhongyu Lu