Clustering using objective functions and stochastic search

James G. Booth, George Casella and James P. Hobert

Journal of the Royal Statistical Society Series B, 2008, vol. 70, issue 1, 119-139

Abstract: A new approach to clustering multivariate data, based on a multilevel linear mixed model, is proposed. A key feature of the model is that observations from the same cluster are correlated, because they share cluster‐specific random effects. The inclusion of cluster‐specific random effects allows parsimonious departure from an assumed base model for cluster mean profiles. This departure is captured statistically via the posterior expectation, or best linear unbiased predictor. One of the parameters in the model is the true underlying partition of the data, and the posterior distribution of this parameter, which is known up to a normalizing constant, is used to cluster the data. The problem of finding partitions with high posterior probability is not amenable to deterministic methods such as the EM algorithm. Thus, we propose a stochastic search algorithm that is driven by a Markov chain that is a mixture of two Metropolis–Hastings algorithms: one that makes small scale changes to individual objects and another that performs large scale moves involving entire clusters. The methodology proposed is fundamentally different from the well‐known finite mixture model approach to clustering, which does not explicitly include the partition as a parameter, and involves an independent and identically distributed structure.
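The search strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The objective `log_score` is a hypothetical stand-in (negative within-cluster sum of squares minus a per-cluster penalty) for the paper's posterior, which is not reproduced in this record, and the two proposal functions, the mixing weight `p_large`, and all other names are assumptions made for illustration. Both proposals are symmetric on label vectors, so the plain Metropolis acceptance rule is used.

```python
import math
import random


def log_score(data, labels, penalty=2.0):
    """Hypothetical stand-in objective (NOT the paper's posterior):
    minus the within-cluster sum of squares, minus a penalty per cluster."""
    clusters = {}
    for x, lab in zip(data, labels):
        clusters.setdefault(lab, []).append(x)
    total = 0.0
    for members in clusters.values():
        mean = sum(members) / len(members)
        total -= sum((x - mean) ** 2 for x in members)
    return total - penalty * len(clusters)


def propose_small(labels, rng):
    """Small-scale move: give one randomly chosen object a random label."""
    new = list(labels)
    i = rng.randrange(len(new))
    new[i] = rng.randrange(len(new))  # labels range over 0..n-1
    return new


def propose_large(labels, rng):
    """Large-scale move: pick two labels and redistribute all objects
    carrying either label between the two at random (covers merges and splits)."""
    a, b = rng.sample(range(len(labels)), 2)
    return [rng.choice((a, b)) if lab in (a, b) else lab for lab in labels]


def stochastic_search(data, n_iter=20000, p_large=0.2, seed=0):
    """Run a mixture of the two moves and return the best partition visited."""
    rng = random.Random(seed)
    labels = [0] * len(data)  # start with every object in one cluster
    current = log_score(data, labels)
    best, best_score = list(labels), current
    for _ in range(n_iter):
        move = propose_large if rng.random() < p_large else propose_small
        cand = move(labels, rng)
        cand_score = log_score(data, cand)
        # Symmetric proposals: accept with probability min(1, exp(difference)).
        if cand_score >= current or rng.random() < math.exp(cand_score - current):
            labels, current = cand, cand_score
            if current > best_score:
                best, best_score = list(labels), current
    return best, best_score
```

For example, `stochastic_search([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])` returns a label vector and its score. Because the partition itself is the state of the chain, the search keeps track of the highest-scoring partition it visits rather than estimating mixture parameters.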

Date: 2008

Downloads: (external link)
https://doi.org/10.1111/j.1467-9868.2007.00629.x


Persistent link: https://EconPapers.repec.org/RePEc:bla:jorssb:v:70:y:2008:i:1:p:119-139

Ordering information: This journal article can be ordered from
http://ordering.onli ... 1111/(ISSN)1467-9868

Journal of the Royal Statistical Society Series B is currently edited by P. Fryzlewicz and I. Van Keilegom


 
Page updated 2025-03-19
Handle: RePEc:bla:jorssb:v:70:y:2008:i:1:p:119-139