EconPapers    
Economics at your fingertips  
 

Normalised Clustering Accuracy: An Asymmetric External Cluster Validity Measure

Marek Gagolewski ()
Additional contact information
Marek Gagolewski: Warsaw University of Technology

Journal of Classification, 2025, vol. 42, issue 1, No 2, 2-30

Abstract: Abstract There is no, nor will there ever be, single best clustering algorithm. Nevertheless, we would still like to be able to distinguish between methods that work well on certain task types and those that systematically underperform. Clustering algorithms are traditionally evaluated using either internal or external validity measures. Internal measures quantify different aspects of the obtained partitions, e.g., the average degree of cluster compactness or point separability. However, their validity is questionable because the clusterings they endorse can sometimes be meaningless. External measures, on the other hand, compare the algorithms’ outputs to fixed ground truth groupings provided by experts. In this paper, we argue that the commonly used classical partition similarity scores, such as the normalised mutual information, Fowlkes–Mallows, or adjusted Rand index, miss some desirable properties. In particular, they do not identify worst-case scenarios correctly, nor are they easily interpretable. As a consequence, the evaluation of clustering algorithms on diverse benchmark datasets can be difficult. To remedy these issues, we propose and analyse a new measure: a version of the optimal set-matching accuracy, which is normalised, monotonic with respect to some similarity relation, scale-invariant, and corrected for the imbalancedness of cluster sizes (but neither symmetric nor adjusted for chance).

Keywords: Clustering; External cluster validity; Optimal set matching; Normalisation; Accuracy; Adjusted Rand index; Mutual information (search for similar items in EconPapers)
Date: 2025
References: Add references at CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s00357-024-09482-2 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:jclass:v:42:y:2025:i:1:d:10.1007_s00357-024-09482-2

Ordering information: This journal article can be ordered from
http://www.springer. ... hods/journal/357/PS2

DOI: 10.1007/s00357-024-09482-2

Access Statistics for this article

Journal of Classification is currently edited by Douglas Steinley

More articles in Journal of Classification from Springer, The Classification Society
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-04-02
Handle: RePEc:spr:jclass:v:42:y:2025:i:1:d:10.1007_s00357-024-09482-2