Bootstrapping for Significance of Compact Clusters in Multidimensional Datasets
Ranjan Maitra,
Volodymyr Melnykov and
Soumendra N. Lahiri
Journal of the American Statistical Association, 2012, vol. 107, issue 497, 378-392
Abstract:
This article proposes a bootstrap approach for assessing significance in the clustering of multidimensional datasets. The procedure compares two models and declares the more complicated model a better candidate if there is significant evidence in its favor. The performance of the procedure is illustrated on two well-known classification datasets and comprehensively evaluated in terms of its ability to estimate the number of components via extensive simulation studies, with excellent results. The methodology is also applied to the problem of k -means color quantization of several standard images in the literature and is demonstrated to be a viable approach for determining the minimal and optimal numbers of colors needed to display an image without significant loss in resolution. Additional illustrations and performance evaluations are provided in the online supplementary material.
Date: 2012
References: View complete reference list from CitEc
Citations: View citations in EconPapers (4)
Downloads: (external link)
http://hdl.handle.net/10.1080/01621459.2011.646935 (text/html)
Access to full text is restricted to subscribers.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:taf:jnlasa:v:107:y:2012:i:497:p:378-392
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/UASA20
DOI: 10.1080/01621459.2011.646935
Access Statistics for this article
Journal of the American Statistical Association is currently edited by Xuming He, Jun Liu, Joseph Ibrahim and Alyson Wilson
More articles in Journal of the American Statistical Association from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst ().