On Bayesian Analysis of Parsimonious Gaussian Mixture Models
Xiang Lu,
Yaoxiang Li and
Tanzy Love
Additional contact information
Xiang Lu: Center for Devices and Radiological Health, U.S. Food and Drug Administration
Yaoxiang Li: Georgetown University
Tanzy Love: University of Rochester
Journal of Classification, 2021, vol. 38, issue 3, No 9, 576-593
Abstract:
Cluster analysis is the task of grouping a set of objects in such a way that objects in the same cluster are similar to each other. It is widely used in many fields including machine learning, bioinformatics, and computer graphics. In all of these applications, the partition is an inference goal, along with the number of clusters and their distinguishing characteristics. Mixtures of factor analyzers (MFA) is a special case of model-based clustering which assumes the variance of each cluster comes from a factor analysis model. It simplifies the Gaussian mixture model through parameter dimension reduction and conceptually represents the variables as coming from a lower dimensional subspace where the clusters are separate. In this paper, we introduce a new RJMCMC (reversible-jump Markov chain Monte Carlo) inferential procedure for the family of constrained MFA models. The three goals of inference here are the partition of the objects, estimation of the number of clusters, and identification and estimation of the covariance structure of the clusters; each of these therefore has a posterior distribution. RJMCMC is the major sampling tool, which allows the dimension of the parameters to be estimated. We present simulations comparing the estimation of the clustering parameters and the partition between this inferential technique and previous methods. Finally, we illustrate these new methods with a dataset of DNA methylation measures for subjects with different brain tumor types. Our method uses four latent factors to correctly discover the five brain tumor types without assuming a constant variance structure, and it classifies subjects with excellent performance.
Keywords: Mixture models; Factor analysis; Cluster analysis; Model-based clustering; RJMCMC; Bayesian clustering
Date: 2021
Downloads: http://link.springer.com/10.1007/s00357-021-09391-8 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:jclass:v:38:y:2021:i:3:d:10.1007_s00357-021-09391-8
Ordering information: This journal article can be ordered from
http://www.springer. ... hods/journal/357/PS2
DOI: 10.1007/s00357-021-09391-8
Journal of Classification is currently edited by Douglas Steinley
Publisher: Springer, The Classification Society
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.