Bayesian nonparametric clustering as a community detection problem

Stefano F. Tonellato

Computational Statistics & Data Analysis, 2020, vol. 152, issue C

Abstract: A wide class of Bayesian nonparametric priors leads to the representation of the distribution of the observable variables as a mixture density with an infinite number of components. Such a representation induces a clustering structure in the data. However, due to label switching, cluster identification is not straightforward a posteriori and some post-processing of the MCMC output is usually required. Alternatively, observations can be mapped on a weighted undirected graph, where each node represents a sample item and edge weights are given by the posterior pairwise similarities. It is shown how, after building a particular random walk on such a graph, it is possible to apply a community detection algorithm, known as map equation, leading to the minimisation of the expected description length of the partition. A relevant feature of this method is that it allows for the quantification of the posterior uncertainty of the classification.
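The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract refers to "a particular random walk" without specifying it, so the code below assumes the plain random walk whose transition probabilities are proportional to edge weights, with self-loops dropped, and it evaluates the standard two-level map equation for a given partition rather than searching for the minimiser. The function names (posterior_similarity, random_walk, map_equation, plogp_sum) and the toy label draws are hypothetical, and every item is assumed to share a cluster with some other item in at least one draw.

```python
import numpy as np

def posterior_similarity(labels):
    """Estimate pairwise co-clustering probabilities from MCMC label draws.

    labels: (S, n) integer array, one row of cluster labels per draw.
    Comparing labels within a draw makes the result immune to label switching.
    """
    labels = np.asarray(labels)
    co = labels[:, :, None] == labels[:, None, :]  # (S, n, n) co-membership
    return co.mean(axis=0)

def random_walk(similarity):
    """Weight-proportional random walk on the similarity graph (an assumption)."""
    W = np.array(similarity, dtype=float)
    np.fill_diagonal(W, 0.0)           # drop self-loops
    strength = W.sum(axis=1)           # assumed strictly positive
    P = W / strength[:, None]          # transition matrix, rows sum to 1
    pi = strength / strength.sum()     # stationary distribution (undirected graph)
    return P, pi

def plogp_sum(x):
    """Sum of x * log2(x) over the positive entries of x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    x = x[x > 0]
    return float(np.sum(x * np.log2(x)))

def map_equation(P, pi, partition):
    """Two-level map equation: expected description length in bits per step."""
    partition = np.asarray(partition)
    modules = np.unique(partition)
    flow = pi[:, None] * P             # stationary flow on each directed edge
    q = np.empty(len(modules))         # probability of exiting each module
    p_in = np.empty(len(modules))      # total visit rate of each module
    for k, m in enumerate(modules):
        inside = partition == m
        q[k] = flow[inside][:, ~inside].sum()
        p_in[k] = pi[inside].sum()
    return (plogp_sum(q.sum()) - 2.0 * plogp_sum(q)
            - plogp_sum(pi) + plogp_sum(q + p_in))

# Toy label draws for six items (made up for illustration only).
draws = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
])
S = posterior_similarity(draws)
P, pi = random_walk(S)
for candidate in ([0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 1, 1]):
    print(candidate, map_equation(P, pi, candidate))
```

Candidate partitions can be compared by their description lengths, with a shorter length preferred. With every item in a single module the exit probabilities vanish and the expression reduces to the entropy of the stationary distribution, which gives a convenient sanity check.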

Keywords: Dirichlet process priors; Mixture models; Community detection; Entropy; Clustering uncertainty
Date: 2020
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947320301353
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:152:y:2020:i:c:s0167947320301353

DOI: 10.1016/j.csda.2020.107044

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:152:y:2020:i:c:s0167947320301353