Nonparametric cluster significance testing with reference to a unimodal null distribution

Erika S. Helgeson, David M. Vock and Eric Bair

Biometrics, 2021, vol. 77, issue 4, 1215-1226

Abstract: Cluster analysis is an unsupervised learning strategy that is exceptionally useful for identifying homogeneous subgroups of observations in data sets of unknown structure. However, it is challenging to determine if the identified clusters represent truly distinct subgroups rather than noise. Existing approaches for addressing this problem tend to define clusters based on distributional assumptions, ignore the inherent correlation structure in the data, or are not suited for high‐dimension low‐sample size (HDLSS) settings. In this paper, we propose a novel method to evaluate the significance of identified clusters by comparing the explained variation due to the clustering from the original data to that produced by clustering a unimodal reference distribution that preserves the covariance structure in the data. The reference distribution is generated using kernel density estimation, and thus, does not require that the data follow a particular distribution. By utilizing sparse covariance estimation, the method is adapted for the HDLSS setting. The approach can be used to test the null hypothesis that the data cannot be partitioned into clusters and to determine the optimal number of clusters. Simulation examples, theoretical evaluations, and applications to temporomandibular disorder research and cancer microarray data illustrate the utility of the proposed method.
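The testing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes k = 2, measures clustering strength with a 2-means cluster index (within-cluster sum of squares divided by total sum of squares, i.e. one minus the explained variation), and, as a loudly simplified stand-in for the paper's kernel-density-based unimodal reference, draws reference data from a multivariate Gaussian with the sample mean and covariance. All function names are hypothetical, and the sparse covariance estimation used for the HDLSS setting is not shown.

```python
import math
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(group) for col in zip(*group)]

def two_means_index(points, rng, n_starts=5, n_iter=25):
    """2-means cluster index: within-cluster SS / total SS (smaller = stronger clustering)."""
    total_ss = sum(sq_dist(p, centroid(points)) for p in points)
    best = float("inf")
    for _ in range(n_starts):
        centers = rng.sample(points, 2)  # random initial centers
        for _ in range(n_iter):
            groups = ([], [])
            for p in points:
                nearer = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                groups[nearer].append(p)
            if not groups[0] or not groups[1]:
                break  # degenerate split; keep the current centers
            new_centers = [centroid(g) for g in groups]
            if new_centers == centers:
                break  # converged
            centers = new_centers
        within = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, within)
    return best / total_ss

def covariance(points):
    """Sample mean vector and covariance matrix (n - 1 denominator)."""
    n, mean = len(points), centroid(points)
    d = len(mean)
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / (n - 1)
            for b in range(d)] for a in range(d)]
    return mean, cov

def cholesky(a):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / low[j][j]
    return low

def sample_reference(points, rng):
    """ASSUMPTION: Gaussian stand-in for the paper's KDE-based unimodal reference.
    It is unimodal and preserves the sample covariance, but it is parametric."""
    mean, cov = covariance(points)
    low = cholesky(cov)
    d = len(mean)
    out = []
    for _ in points:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        out.append([mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)])
    return out

def cluster_p_value(points, n_ref=100, seed=0):
    """Monte Carlo p-value: share of reference data sets that cluster at least as well."""
    rng = random.Random(seed)
    observed = two_means_index(points, rng)
    ref = [two_means_index(sample_reference(points, rng), rng) for _ in range(n_ref)]
    p_value = (1 + sum(r <= observed for r in ref)) / (n_ref + 1)  # add-one correction
    return observed, p_value
```

Calling `cluster_p_value(data)` on a list of equal-length numeric rows returns the observed index and a p-value for the null hypothesis that the data cannot be partitioned into two clusters. Swapping `sample_reference` for a nonparametric unimodal generator would bring the sketch closer to the approach the abstract describes.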

Date: 2021
Downloads: (external link)
https://doi.org/10.1111/biom.13376

Persistent link: https://EconPapers.repec.org/RePEc:bla:biomet:v:77:y:2021:i:4:p:1215-1226

More articles in Biometrics from The International Biometric Society

 
Handle: RePEc:bla:biomet:v:77:y:2021:i:4:p:1215-1226