A New Method to Determine Cluster Number Without Clustering for Every K Based on Ratio of Variance to Range in K-Means
Yong Ae Ri, Chol Ryong Kang, Kuk Hyon Kim, Yong Myong Choe, Un Chol Han and Weifeng Pan
Mathematical Problems in Engineering, 2022, vol. 2022, 1-14
Abstract:
Many clustering algorithms, such as K-means and FCM, require the cluster number K to be known beforehand. In this paper, we propose a new method that determines the cluster number in K-means without running the clustering for every candidate K. We introduce a new statistic, RVR (ratio of variance to range), and conduct a Monte Carlo analysis of its characteristics. Based on the RVR, we propose an algorithm that determines the cluster number K and performs clustering with it. We evaluate its effectiveness in simulation tests on two types of datasets: first, real datasets whose true number of clusters and components are known, and second, synthetic datasets. We observe a significant improvement in the speed and quality of determining the cluster number, and therefore of clustering. Finally, we hope the proposed algorithm will be used efficiently and widely for clustering multidimensional data.
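The abstract names the RVR statistic but does not give its formula. As a rough illustration only, the sketch below takes the name literally and divides the population variance of a one-dimensional sample by its range (maximum minus minimum). The function name `variance_to_range_ratio`, the choice of population variance, and the handling of a zero range are all assumptions made here; the paper's actual definition and its cluster-number algorithm may differ.

```python
from statistics import pvariance


def variance_to_range_ratio(values):
    """Literal reading of 'ratio of variance to range' for a 1-D sample.

    Assumption: population variance divided by (max - min).
    This is NOT taken from the paper, whose exact definition may differ.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two values")
    spread = max(values) - min(values)
    if spread == 0:
        # All values identical: the variance is 0 too, so report 0.0.
        return 0.0
    return pvariance(values) / spread


# Hypothetical samples: one tight group versus two well-separated groups.
tight = [1.0, 1.1, 0.9, 1.0]
two_groups = [0.0, 0.1, 9.9, 10.0]

print(variance_to_range_ratio(tight))
print(variance_to_range_ratio(two_groups))
```

Under this literal reading, the sample made of two separated groups yields a larger ratio than the tight one, which hints at why such a statistic could carry information about cluster structure without running K-means for every K.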
Date: 2022
Downloads: (external link)
http://downloads.hindawi.com/journals/mpe/2022/6866747.pdf (application/pdf)
http://downloads.hindawi.com/journals/mpe/2022/6866747.xml (application/xml)
Persistent link: https://EconPapers.repec.org/RePEc:hin:jnlmpe:6866747
DOI: 10.1155/2022/6866747