Better than the best? Answers via model ensemble in density-based clustering

Alessandro Casa, Luca Scrucca and Giovanna Menardi
Additional contact information
Alessandro Casa: University of Padova
Giovanna Menardi: University of Padova

Advances in Data Analysis and Classification, 2021, vol. 15, issue 3, No 4, 599-623

Abstract: With the recent growth in data availability and complexity, and the associated outburst of elaborate modelling approaches, model selection tools have become a lifeline, providing objective criteria to deal with this increasingly challenging landscape. In fact, basing predictions and inference on a single model may be limiting if not harmful; ensemble approaches, which combine different models, have been proposed to overcome the selection step, and have proven fruitful, especially in the supervised learning framework. Conversely, these approaches have been scantily explored in the unsupervised setting. In this work we focus on the model-based clustering formulation, where a plethora of mixture models, with different numbers of components and parametrizations, is typically estimated. We propose an ensemble clustering approach that circumvents the single best model paradigm, while improving stability and robustness of the partitions. A new density estimator, being a convex linear combination of the density estimates in the ensemble, is introduced and exploited for group assignment. As opposed to the standard case, where clusters are typically associated with the components of the selected mixture model, we define partitions by borrowing the modal, or nonparametric, formulation of the clustering problem, where groups are linked with high-density regions. Staying in the density-based realm, we thus show how blending together parametric and nonparametric approaches may be beneficial from a clustering perspective.
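
The two ideas in the abstract, a convex combination of mixture density estimates and modal assignment of points to high-density regions, can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration: the two Gaussian mixture "models" have hand-picked parameters rather than fitted ones, the ensemble weights are fixed by hand (the abstract does not say how the weights are chosen), and the grid-based hill climb in `climb_to_mode` is a simple stand-in for a proper mode-seeking procedure, not the authors' algorithm. All function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a Gaussian mixture; components = [(proportion, mean, sd), ...]."""
    return sum(p * normal_pdf(x, m, s) for p, m, s in components)


def ensemble_pdf(x, models, weights):
    """Convex combination of mixture densities (weights >= 0, summing to 1)."""
    return sum(w * mixture_pdf(x, comps) for w, comps in zip(weights, models))


def climb_to_mode(x, density, step=0.01, max_iter=100_000):
    """Move along a grid of spacing `step` towards higher density
    until neither neighbour is higher (a crude local mode search)."""
    for _ in range(max_iter):
        here, left, right = density(x), density(x - step), density(x + step)
        if left <= here >= right:
            return x
        x = x - step if left > right else x + step
    return x


def modal_clusters(points, density, step=0.01, tol=0.1):
    """Label each point by the local mode its hill climb converges to."""
    modes, labels = [], []
    for x in points:
        m = climb_to_mode(x, density, step)
        for k, known in enumerate(modes):
            if abs(m - known) < tol:  # same mode up to grid resolution
                labels.append(k)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels, modes


# Hypothetical ensemble: a 2-component and a 3-component mixture (assumed values).
model_a = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]
model_b = [(0.5, -2.0, 1.0), (0.25, 1.5, 0.7), (0.25, 2.5, 0.7)]
models = [model_a, model_b]
weights = [0.6, 0.4]  # assumed; non-negative and summing to one


def density(x):
    return ensemble_pdf(x, models, weights)


points = [-3.0, -1.5, -0.5, 0.6, 1.8, 3.1]
labels, modes = modal_clusters(points, density)
print(labels, [round(m, 2) for m in modes])
```

Note that the number of groups comes from the number of modes of the combined density, not from the number of components in either mixture: in this sketch the 3-component model contributes to the ensemble without forcing a third cluster.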

Keywords: Cluster analysis; Model averaging; Ensemble learning; Density-based clustering; Density estimation; 62H30; 62H99
Date: 2021
Downloads: (external link)
http://link.springer.com/10.1007/s11634-020-00423-6 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:15:y:2021:i:3:d:10.1007_s11634-020-00423-6

Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2

DOI: 10.1007/s11634-020-00423-6

Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs

More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-30
Handle: RePEc:spr:advdac:v:15:y:2021:i:3:d:10.1007_s11634-020-00423-6