Model based clustering of high-dimensional binary data
Yang Tang, Ryan P. Browne and Paul D. McNicholas
Computational Statistics & Data Analysis, 2015, vol. 87, issue C, 84-101
Abstract:
A mixture of latent trait models with common slope parameters for model-based clustering of high-dimensional binary data, a data type for which few established methods exist, is proposed. Recent work on clustering of binary data, based on a d-dimensional Gaussian latent variable, is extended by incorporating common factor analyzers. Accordingly, this approach facilitates a low-dimensional visual representation of the clusters. The model is further extended by the incorporation of random block effects. The dependencies in each block are taken into account through block-specific parameters that are considered to be random variables. A variational approximation to the likelihood is exploited to derive a fast algorithm for determining the model parameters. Real and simulated data are used to demonstrate this approach.
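The generative idea summarised in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `simulate_common_slope_mixture`, the logistic link, the identity latent covariance, and the absence of intercepts and block effects are all illustrative choices that the abstract does not specify. It only shows how a slope matrix shared across groups, combined with group-specific latent means, yields clustered binary data with a low-dimensional latent representation.

```python
import numpy as np

def simulate_common_slope_mixture(n, weights, slopes, latent_means, seed=0):
    """Draw binary data from a toy mixture of latent trait models.

    Assumed (hypothetical) parameterisation, not taken from the paper:
      g        ~ Categorical(weights)
      z | g    ~ N(latent_means[g], I_d)
      y_j | z  ~ Bernoulli(sigmoid(slopes[j] @ z))
    The slope matrix `slopes` (p x d) is shared by every group.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    latent_means = np.asarray(latent_means, dtype=float)
    n_groups, d = latent_means.shape

    # Group membership for each observation.
    labels = rng.choice(n_groups, size=n, p=weights)
    # d-dimensional Gaussian latent variable centred on the group mean.
    z = latent_means[labels] + rng.standard_normal((n, d))
    # Common slopes map the latent variable to item-level success probabilities.
    probs = 1.0 / (1.0 + np.exp(-(z @ slopes.T)))
    y = (rng.random(probs.shape) < probs).astype(int)
    return y, z, labels

# Example: 300 observations, 20 binary items, 2 latent dimensions, 3 groups.
slopes = np.random.default_rng(42).normal(size=(20, 2))
latent_means = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
y, z, labels = simulate_common_slope_mixture(300, [0.3, 0.3, 0.4], slopes, latent_means)
```

Because the latent variable `z` is two-dimensional here, plotting its columns coloured by `labels` gives the kind of low-dimensional view of the clusters that the abstract mentions.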
Keywords: Binary data; Clustering; Data visualization; High dimension; Latent variables; Mixture models
Date: 2015
Full text (ScienceDirect subscribers only): http://www.sciencedirect.com/science/article/pii/S0167947314003570
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:87:y:2015:i:c:p:84-101
DOI: 10.1016/j.csda.2014.12.009
Computational Statistics & Data Analysis is currently edited by S.P. Azen