Clustering Validation Inference
Pau Figuera,
Alfredo Cuzzocrea and
Pablo García Bringas
Additional contact information
Pau Figuera: Faculty of Engineering, University of Deusto, 48007 Bilbao, Spain
Alfredo Cuzzocrea: iDEA Lab, University of Calabria, 87036 Rende, Italy
Pablo García Bringas: Faculty of Engineering, University of Deusto, 48007 Bilbao, Spain
Mathematics, 2024, vol. 12, issue 15, 1-31
Abstract:
Clustering validation evaluates the quality of classifications, a crucial step in unsupervised machine learning. A plethora of methods exist for this purpose; however, a common drawback is that they do not allow statistical inference. In this study, we construct a density function for the number of clusters. For this purpose, we use smoothing techniques and then apply non-negative matrix factorization with the Kullback–Leibler divergence. Employing a unique hypothesis of linearly independent, uncorrelated observational variables, and using only analytical techniques, we construct a sequence by varying the dimension of the span space of the factorization. The expectation of the limit of this sequence follows a gamma probability density function. Then, identifying the dimension of the span space of the factorization with the number of clusters, we transform the estimation of a suitable factorization dimension into a probabilistic estimate of the number of clusters. This approach is an internal validation method that is suitable for numerical and categorical multivariate data and is independent of the clustering technique. Our main achievement is a predictive clustering validation model with graphical abilities. It provides results in terms of credibility, making it possible to compare them with other results, such as expert judgment, on a quantitative basis.
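As a rough illustration of the factorization step mentioned in the abstract, the sketch below fits a non-negative matrix factorization under the generalized Kullback–Leibler divergence and repeats the fit for several factorization dimensions. It assumes the standard Lee-Seung multiplicative updates, which may differ from the algorithm used in the article, and it does not reproduce the trace sequence or the gamma density. The names nmf_kl, kl_divergence, and scan_ranks are hypothetical and chosen only for this example.

```python
import numpy as np


def nmf_kl(V, k, n_iter=200, seed=0, eps=1e-12):
    """Factor V ~ W @ H with non-negative W (n x k) and H (k x m).

    Uses Lee-Seung multiplicative updates for the generalized KL
    divergence (an assumption; the article may use another scheme).
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # H update: numerator W^T (V / WH), denominator column sums of W
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # W update: numerator (V / WH) H^T, denominator row sums of H
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) for non-negative arrays."""
    V = np.asarray(V, dtype=float)
    WH = np.asarray(WH, dtype=float)
    mask = V > 0  # terms with V == 0 contribute only through WH.sum()
    log_term = np.sum(V[mask] * np.log(V[mask] / (WH[mask] + eps)))
    return float(log_term - V.sum() + WH.sum())


def scan_ranks(V, ranks, n_iter=200, seed=0):
    """Fit one factorization per candidate dimension k; return {k: divergence}."""
    result = {}
    for k in ranks:
        W, H = nmf_kl(V, k, n_iter=n_iter, seed=seed)
        result[k] = kl_divergence(V, W @ H)
    return result
```

Calling scan_ranks(V, range(1, 6)) on a non-negative data matrix V returns the fitted divergence for each candidate dimension. Turning such a sequence into a probabilistic statement about the number of clusters is the contribution of the article itself and is not attempted here.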
Keywords: non-negative matrix factorization; trace sequence limit; clustering validation; inferential clustering validation
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/15/2349/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/15/2349/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:15:p:2349-:d:1444180
Mathematics is currently edited by Ms. Emma He