
ldagibbs: A command for topic modeling in Stata using latent Dirichlet allocation

Carlo Schwarz

Stata Journal, 2018, vol. 18, issue 1, 101-117

Abstract: In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
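The two distributions named in the abstract (topics per document, words per topic) can be illustrated with a small collapsed Gibbs sampler. This is a minimal sketch in Python, not the ldagibbs implementation: the function name `lda_gibbs`, the hyperparameter defaults `alpha` and `beta`, and the iteration count are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper,
    not the ldagibbs command). docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    ndk = [[0] * num_topics for _ in docs]        # tokens in doc d assigned to topic k
    nkw = [[0] * V for _ in range(num_topics)]    # times word v is assigned to topic k
    nk = [0] * num_topics                         # total tokens assigned to topic k

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(num_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(zd)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1

    # Smoothed estimates: theta[d] is a distribution over topics,
    # phi[k] is a distribution over the vocabulary.
    theta = [
        [(ndk[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    return theta, phi, vocab
```

Calling `lda_gibbs(docs, 2)` on a list of tokenized documents returns one row of topic shares per document and one row of word probabilities per topic; each row sums to one. Which topic index ends up describing which theme depends on the random seed.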

Keywords: ldagibbs; machine learning; latent Dirichlet allocation; Gibbs sampling; topic model; text analysis
Date: 2018
Note: To access the software from within Stata, type: net describe http://www.stata-journal.com/software/sj18-1/st0515/

Downloads: (external link)
http://www.stata-journal.com/article.html?article=st0515 link to article purchase



Persistent link: https://EconPapers.repec.org/RePEc:tsj:stataj:v:18:y:2018:i:1:p:101-117

Ordering information: This journal article can be ordered from
http://www.stata-journal.com/subscription.html


Stata Journal is currently edited by Nicholas J. Cox and Stephen P. Jenkins

Bibliographic data for series maintained by Christopher F. Baum and Lisa Gilmore.

 
Page updated 2025-03-22
Handle: RePEc:tsj:stataj:v:18:y:2018:i:1:p:101-117