
On the number of groups in clustering

Aurélie Fischer

Statistics & Probability Letters, 2011, vol. 81, issue 12, 1771-1781

Abstract: Clustering is the problem of partitioning data into a finite number k of homogeneous and separate groups, called clusters. A good choice of k is essential for building meaningful clusters. In this paper, this task is addressed from the point of view of model selection via penalization. We design an appropriate penalty shape and derive an associated oracle-type inequality. The method is illustrated on both simulated and real-life data sets.
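To make the idea of choosing k by penalization concrete, here is a minimal sketch, not the paper's procedure. It assumes one-dimensional data, computes the k-means distortion exactly by dynamic programming, and picks the k that minimizes distortion plus a penalty. The penalty shape `a * sqrt(k / n)`, the constant `a`, and the names `kmeans_1d_distortion` and `select_k` are all assumptions made for illustration. The abstract does not state the penalty, and in practice the constant would be calibrated from the data (for instance by the slope heuristics named in the keywords) rather than fixed by hand.

```python
from math import sqrt


def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for sorted points i..j-1."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def kmeans_1d_distortion(points, k):
    """Minimal within-cluster sum of squares for k clusters of 1-D data.

    In one dimension optimal clusters are contiguous runs of the sorted
    points, so an exact dynamic program over split positions suffices.
    """
    xs = sorted(points)
    n = len(xs)
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)
    inf = float("inf")
    # best[c][j]: minimal cost of splitting the first j points into c clusters
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            best[c][j] = min(
                best[c - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                for i in range(c - 1, j)
            )
    return best[k][n]


def select_k(points, k_max, a):
    """Return the k in 1..k_max minimizing distortion/n + a*sqrt(k/n).

    The penalty shape and the constant `a` are illustrative assumptions.
    """
    n = len(points)
    scores = {}
    for k in range(1, k_max + 1):
        distortion = kmeans_1d_distortion(points, k) / n
        scores[k] = distortion + a * sqrt(k / n)
    return min(scores, key=scores.get), scores


# Toy data: three tight groups centred at 0, 10 and 20 (hypothetical values).
points = [c + d for c in (0.0, 10.0, 20.0) for d in (-0.2, -0.1, 0.0, 0.1, 0.2)]
best_k, scores = select_k(points, k_max=6, a=1.0)
print(best_k)
```

Without the penalty, the distortion alone keeps decreasing as k grows and would always favour the largest k. The penalty term is what makes the criterion stop at a moderate number of groups.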

Keywords: k-means clustering; Number of clusters; Model selection; Oracle inequality; Slope heuristics
Date: 2011
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167715211002367
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:stapro:v:81:y:2011:i:12:p:1771-1781

DOI: 10.1016/j.spl.2011.07.005

Statistics & Probability Letters is currently edited by Somnath Datta and Hira L. Koul

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:stapro:v:81:y:2011:i:12:p:1771-1781