Hidden Markov model with Pitman-Yor priors for probabilistic topic model
Jianjie Guo, Lin Guo, Wenchao Xu and Haibin Zhang
Communications in Statistics - Theory and Methods, 2025, vol. 54, issue 9, 2791-2805
Abstract:
Empirical studies of natural language have demonstrated that word frequencies follow power-law distributions, yet standard statistical models often fail to capture this property. The Pitman-Yor process (PYP), a Bayesian nonparametric model capable of generating power-law distributions, has been widely used in probabilistic topic models to handle data with an infinite number of components. However, existing PYP topic models rarely account for the relationships between topics, whereas hidden Markov models (HMMs) are a popular way to model such relationships. To address this limitation, we propose a probabilistic topic model that combines an HMM with Pitman-Yor priors. Posterior inference is performed using variational Bayes methods. We apply our method to text categorization and compare it with two related topic models: the hidden Markov topic model and the hierarchical PYP topic model.
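The power-law behaviour mentioned in the abstract is the heavy-tailed distribution of cluster sizes that a Pitman-Yor process induces. As a rough illustration only, and not the authors' model or their variational inference procedure, the following Python sketch samples from the two-parameter Chinese restaurant process representation of a PYP; the function name and parameter values are hypothetical defaults chosen for the example.

    import random

    def pitman_yor_crp(n_draws, discount=0.5, concentration=1.0, seed=0):
        # Two-parameter Chinese restaurant process: a new customer joins an
        # existing table k with probability proportional to (n_k - discount),
        # or opens a new table with probability proportional to
        # (concentration + discount * number_of_tables).
        rng = random.Random(seed)
        counts = []   # counts[k] = customers seated at table k
        total = 0     # customers seated so far
        for _ in range(n_draws):
            p_new = (concentration + discount * len(counts)) / (concentration + total)
            if total == 0 or rng.random() < p_new:
                counts.append(1)                # open a new table
            else:
                weights = [c - discount for c in counts]
                r = rng.random() * sum(weights)
                acc = 0.0
                for k, w in enumerate(weights):
                    acc += w
                    if r <= acc:
                        counts[k] += 1          # join existing table k
                        break
            total += 1
        return counts

    if __name__ == "__main__":
        counts = pitman_yor_crp(20000)
        # Table sizes are heavy-tailed: a few very large tables, many singletons.
        print(sorted(counts, reverse=True)[:10], "tables:", len(counts))

With a positive discount, the number of tables grows polynomially in the number of draws and the sorted table sizes decay roughly as a power law, which is the property the abstract contrasts with standard (e.g. Dirichlet-process) models.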
Date: 2025
Downloads: http://hdl.handle.net/10.1080/03610926.2024.2370920 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:lstaxx:v:54:y:2025:i:9:p:2791-2805
Ordering information: This journal article can be ordered from http://www.tandfonline.com/pricing/journal/lsta20
DOI: 10.1080/03610926.2024.2370920
Communications in Statistics - Theory and Methods is currently edited by Debbie Iscoe