Improved Bayesian inference for the stochastic block model with application to large networks
Aaron F. McDaid,
Thomas Brendan Murphy,
Nial Friel and
Neil J. Hurley
Computational Statistics & Data Analysis, 2013, vol. 60, issue C, 12-31
Abstract:
An efficient MCMC algorithm is presented to cluster the nodes of a network such that nodes with similar role in the network are clustered together. This is known as block-modeling or block-clustering. The model is the stochastic blockmodel (SBM) with block parameters integrated out. The resulting marginal distribution defines a posterior over the number of clusters and cluster memberships. Sampling from this posterior is simpler than from the original SBM as transdimensional MCMC can be avoided. The algorithm is based on the allocation sampler. It requires a prior to be placed on the number of clusters, thereby allowing the number of clusters to be directly estimated by the algorithm, rather than being given as an input parameter. Synthetic and real data are used to test the speed and accuracy of the model and algorithm, including the ability to estimate the number of clusters. The algorithm can scale to networks with up to ten thousand nodes and tens of millions of edges.
Keywords: Clustering; Social networks; Blockmodeling; Computational statistics; MCMC (search for similar items in EconPapers)
Date: 2013
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (6)
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947312003891
Full text for ScienceDirect subscribers only.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:60:y:2013:i:c:p:12-31
DOI: 10.1016/j.csda.2012.10.021
Access Statistics for this article
Computational Statistics & Data Analysis is currently edited by S.P. Azen
More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu ().