Variational Bayes estimation of hierarchical Dirichlet-multinomial mixtures for text clustering
Massimo Bilancia, Michele Nanni, Fabio Manca and Gianvito Pio
Additional contact information
Massimo Bilancia: University of Bari Aldo Moro, Policlinic University Hospital
Michele Nanni: EY Business and Technology Solution
Fabio Manca: University of Bari Aldo Moro, Palazzo Chiaia - Napolitano
Gianvito Pio: University of Bari Aldo Moro
Computational Statistics, 2023, vol. 38, issue 4, No 19, 2015-2051
Abstract:
In this paper, we formulate a hierarchical Bayesian version of the Mixture of Unigrams model for text clustering and approach its posterior inference through variational inference. We compute the explicit expression of the variational objective function for our hierarchical model under a mean-field approximation. We then derive the update equations of a suitable algorithm based on coordinate ascent to find local maxima of the variational target, and estimate the model parameters through the optimized variational hyperparameters. The advantages of variational algorithms over traditional Markov Chain Monte Carlo methods based on iterative posterior sampling are also discussed in detail.
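Illustrative note: the abstract does not reproduce the update equations, so the following is only a minimal sketch of mean-field coordinate ascent for a simplified Bayesian Mixture of Unigrams, not the authors' hierarchical model or algorithm. It assumes symmetric Dirichlet priors with hypothetical hyperparameters `alpha` (cluster proportions) and `eta` (per-cluster word distributions), and every function and variable name is an assumption introduced for illustration.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def cavi_mixture_of_unigrams(counts, n_clusters, alpha=1.0, eta=0.1,
                             n_iter=50, seed=0):
    """Coordinate ascent for q(pi) q(beta) q(z) on a document-term count matrix.

    Assumed model (a simplification): pi ~ Dirichlet(alpha),
    beta_k ~ Dirichlet(eta), z_d ~ Categorical(pi), x_d ~ Multinomial(beta_{z_d}).
    """
    rng = random.Random(seed)
    n_docs, vocab = len(counts), len(counts[0])
    # Random initial responsibilities phi[d][k]; each row sums to 1.
    phi = []
    for _ in range(n_docs):
        row = [rng.random() + 0.1 for _ in range(n_clusters)]
        total = sum(row)
        phi.append([r / total for r in row])
    for _ in range(n_iter):
        # Update q(pi) = Dirichlet(gamma) and q(beta_k) = Dirichlet(lam[k]).
        gamma = [alpha + sum(phi[d][k] for d in range(n_docs))
                 for k in range(n_clusters)]
        lam = [[eta + sum(phi[d][k] * counts[d][v] for d in range(n_docs))
                for v in range(vocab)] for k in range(n_clusters)]
        # Expected log parameters under the Dirichlet factors.
        psi_gamma = digamma(sum(gamma))
        e_log_pi = [digamma(g) - psi_gamma for g in gamma]
        e_log_beta = []
        for row in lam:
            psi_row = digamma(sum(row))
            e_log_beta.append([digamma(l) - psi_row for l in row])
        # Update q(z_d) = Categorical(phi[d]), normalised with log-sum-exp.
        for d in range(n_docs):
            log_r = [e_log_pi[k] + sum(counts[d][v] * e_log_beta[k][v]
                                       for v in range(vocab))
                     for k in range(n_clusters)]
            top = max(log_r)
            weights = [math.exp(x - top) for x in log_r]
            total = sum(weights)
            phi[d] = [w / total for w in weights]
    return phi, gamma, lam


# Toy corpus (made up for illustration): 4 documents over a 4-word vocabulary.
docs = [[5, 4, 0, 0], [6, 3, 0, 1], [0, 0, 5, 5], [0, 1, 4, 6]]
phi, gamma, lam = cavi_mixture_of_unigrams(docs, n_clusters=2)
# Hard cluster labels from the optimized responsibilities.
labels = [max(range(2), key=lambda k: phi[d][k]) for d in range(len(docs))]
# Point estimates of cluster proportions from the variational hyperparameters.
pi_hat = [g / sum(gamma) for g in gamma]
```

In this sketch the cluster proportions are read off the optimized variational hyperparameters (`pi_hat`), which mirrors the abstract's description of estimating model parameters from them. As with any coordinate ascent scheme, the result is a local optimum that depends on the random initialization.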
Keywords: Text clustering; Finite mixture models; Dirichlet-multinomial distribution; Bayesian hierarchical modelling; Variational inference
Date: 2023
Downloads: http://link.springer.com/10.1007/s00180-023-01350-8 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:38:y:2023:i:4:d:10.1007_s00180-023-01350-8
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2
DOI: 10.1007/s00180-023-01350-8
Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.