Multivariate permutation tests for the k-sample problem with clustered data
Jörg Rahnenführer
Additional contact information
Jörg Rahnenführer: Heinrich-Heine-Universität Düsseldorf
Computational Statistics, 2002, vol. 17, issue 2, No 2, 165-184
Abstract:
We investigate the optimal choice of clustering algorithm for multivariate data sets. We use algorithms that define partitions by maximal support planes (MSP) of a convex function, a class studied in depth by Pötzelberger and Strasser (2000). This broad class contains both the well-known k-means algorithm and the Kohonen (1985) algorithm as special cases. We compare the quality of the clustering procedures by first applying them to multivariate data sets and then treating a k-sample problem. To compute the test statistics, the data points are replaced by their conditional expectations with respect to the MSP-partition. Monte Carlo simulations of the power functions of these tests, which are carried out as multivariate permutation tests, show a strong connection between the optimal choice of algorithm and the tails of the probability distribution of the data. In particular, for heavy-tailed distributions the performance of k-means type algorithms breaks down completely.
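The pipeline described in the abstract (partition the pooled data, replace each point by the conditional expectation over its cell, then run a permutation test for the k-sample problem) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes plain k-means as the partitioning step (one special case of the MSP class), takes the empirical cell mean as the conditional expectation, and uses a between-group sum of squares as the test statistic. The function names, the simulated data, the number of cells `m=6`, and the statistic are all hypothetical choices made for illustration.

```python
import numpy as np

def kmeans_labels(x, m, rng, n_iter=50):
    """Plain Lloyd iterations; returns a cell index for each row of x."""
    centers = x[rng.choice(len(x), size=m, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distances from every point to every center.
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(m):
            if np.any(labels == j):  # leave an empty cell's center unchanged
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def compress(x, labels):
    """Replace each point by the mean of its cell (empirical conditional expectation)."""
    out = np.empty_like(x, dtype=float)
    for j in np.unique(labels):
        out[labels == j] = x[labels == j].mean(axis=0)
    return out

def between_group_stat(z, groups):
    """Illustrative statistic (an assumption): weighted squared distance of group means from the grand mean."""
    grand = z.mean(axis=0)
    stat = 0.0
    for g in np.unique(groups):
        zg = z[groups == g]
        stat += len(zg) * ((zg.mean(axis=0) - grand) ** 2).sum()
    return stat

def permutation_pvalue(z, groups, rng, n_perm=999):
    """Monte Carlo permutation p-value obtained by shuffling the group labels."""
    observed = between_group_stat(z, groups)
    count = 0
    for _ in range(n_perm):
        if between_group_stat(z, rng.permutation(groups)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Simulated example: three samples in two dimensions, the third one shifted.
rng = np.random.default_rng(0)
x = np.vstack([
    rng.normal(0.0, 1.0, (40, 2)),
    rng.normal(0.0, 1.0, (40, 2)),
    rng.normal(1.5, 1.0, (40, 2)),
])
groups = np.repeat([0, 1, 2], 40)

# The partition is fitted on the pooled data without using the group labels,
# so it does not change when the labels are permuted.
z = compress(x, kmeans_labels(x, m=6, rng=rng))
p = permutation_pvalue(z, groups, rng)
print(f"permutation p-value: {p:.3f}")
```

In this sketch the clustering step acts as data compression: after `compress`, the sample takes at most `m` distinct values, and the test is run on those values only. Swapping `kmeans_labels` for another partitioning rule leaves the rest of the code unchanged, which mirrors the kind of comparison between algorithms that the abstract describes.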
Keywords: Data compression; Clustering; MSP-partitions; Multivariate Permutation tests; k-sample problem
Date: 2002
Downloads: (external link)
http://link.springer.com/10.1007/s001800200100 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:17:y:2002:i:2:d:10.1007_s001800200100
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2
DOI: 10.1007/s001800200100
Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.