Efficient estimation of the number of clusters for high-dimension data

Spiridon Kasapis, Geng Zhang, Jonathon M Smereka and Nickolas Vlahopoulos

The Journal of Defense Modeling and Simulation, 2025, vol. 22, issue 4, 429-441

Abstract: The exponential growth of digital image data has given rise to the need for efficient content management and retrieval tools. Currently, there is a lack of tools for processing the collected unlabeled data in a schematic manner. K-means is one of the most widely used clustering methods and has been applied in a variety of fields, one of them being image sorting. Although a useful tool for image management, the K-means method is heavily influenced by its initializations, the most important being the need to know the number of clusters a priori. A number of different methods have been proposed for identifying the correct number of clusters for K-means, one of them being the variance ratio criterion (VRC). Despite its popularity, the VRC method has two important shortcomings: it only yields good results when the data dimensionality is low, and it does not scale well to a high number of clusters, making it very difficult to use in computer vision applications. We propose an extension to the VRC method that works for a larger number of clusters and for high-dimensional data sets and is therefore suited to image data sets.
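For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the standard VRC (also known as the Calinski-Harabasz index) used to pick K for K-means. It is not the extension proposed in the paper. The function names (`kmeans_labels`, `vrc`, `estimate_k`), the farthest-point initialization, and the toy 2-D data are all assumptions made for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans_labels(points, k, iters=50):
    """Plain Lloyd iterations; returns one cluster label per point.

    Assumption: deterministic farthest-point initialization is used so the
    example is reproducible. Real applications often use random restarts.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return labels

def vrc(points, labels):
    """VRC = [B / (k - 1)] / [W / (n - k)], with B the between-cluster and
    W the within-cluster sum of squared distances."""
    n = len(points)
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("VRC is defined only for 2 <= k < n")
    overall = mean(points)
    between = sum(len(m) * sq_dist(mean(m), overall) for m in clusters.values())
    within = sum(sq_dist(p, mean(m)) for m in clusters.values() for p in m)
    if within == 0:
        return math.inf
    return (between / (k - 1)) / (within / (n - k))

def estimate_k(points, k_values):
    """Run K-means for each candidate K and keep the K with the largest VRC."""
    scores = {k: vrc(points, kmeans_labels(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): three well-separated 2-D groups of four points each.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (0, 10), (0, 11), (1, 10), (1, 11)]

best_k, scores = estimate_k(POINTS, range(2, 6))
print(best_k, scores)
```

On this low-dimensional toy set the criterion selects K = 3. Note that every candidate K requires a full K-means run, which hints at why the plain criterion becomes costly when many cluster counts have to be tried.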

Keywords: Clustering; K-means; number of clusters; initializations; unsupervised learning schema; computer vision; variance ratio criterion
Date: 2025
Downloads: (external link)
https://journals.sagepub.com/doi/10.1177/15485129231214569 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:sae:joudef:v:22:y:2025:i:4:p:429-441

DOI: 10.1177/15485129231214569

Bibliographic data for series maintained by SAGE Publications.

 
Page updated 2025-10-18
Handle: RePEc:sae:joudef:v:22:y:2025:i:4:p:429-441