
Acceptable set topic modeling

Lauren Berk Wheelock and Dessislava A. Pachamanova

European Journal of Operational Research, 2022, vol. 299, issue 2, 653-673

Abstract: Topic modeling is a significant branch of natural language processing and machine learning focused on inferring the generative process of text. Traditionally, algorithms for estimating topic models have relied on Bayesian inference and Gibbs sampling. This paper proposes a novel “acceptable set” framework for formulating topic modeling problems inspired by ideas from discrete component analysis and data-driven robust optimization. Our approach not only simplifies the design and inference of topic models, but also allows for extensions and generalizations that are challenging to integrate into traditional approaches. Different restrictions (e.g., sparsity) and assumptions (e.g., alternative generative processes) can be easily incorporated into our formulations through additional or modified constraints. Our formulations also naturally control a widely used metric of solution quality, perplexity. We adapt state-of-the-art stochastic gradient methods to find good local optima for the optimization formulations. The algorithms are efficient, scaling to realistic problem sizes with runtimes comparable to existing methods. Through extensive computational experiments, we show that our methods have improved solution quality compared to baseline methods and reconstruct more reliably the underlying generative models. Our framework overcomes known vulnerabilities of traditional topic modeling algorithms: our methods are effective in low-data settings, register good out-of-sample performance, and perform well for a variety of initial assumptions on input parameter values.
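The abstract refers to perplexity as a widely used metric of solution quality. As a minimal sketch of the standard per-word definition (not the paper's formulation, which is not reproduced here), perplexity is the exponential of the negative average log-likelihood per word under a mixture-of-topics model. The function name `perplexity` and the inputs `doc_word_counts`, `doc_topic`, and `topic_word` are illustrative assumptions, not taken from the article.

```python
import math

def perplexity(doc_word_counts, doc_topic, topic_word):
    """Per-word perplexity: exp(-(sum of log p(w|d)) / (total word count)).

    doc_word_counts[d][w]: count of word w in document d (hypothetical input)
    doc_topic[d][k]:       probability of topic k in document d
    topic_word[k][w]:      probability of word w under topic k
    """
    log_lik = 0.0
    n_words = 0
    n_topics = len(topic_word)
    for d, counts in enumerate(doc_word_counts):
        for w, c in enumerate(counts):
            if c == 0:
                continue  # words absent from the document contribute nothing
            # Mixture probability of word w in document d
            p = sum(doc_topic[d][k] * topic_word[k][w] for k in range(n_topics))
            log_lik += c * math.log(p)
            n_words += c
    return math.exp(-log_lik / n_words)

# A model that spreads probability uniformly over 4 words
counts = [[1, 2, 0, 3]]
theta = [[0.5, 0.5]]
beta = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
print(perplexity(counts, theta, beta))
```

Lower values indicate that the model assigns higher probability to the observed words; a model that is uniform over a vocabulary of V words has perplexity V.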

Keywords: Topic modeling; Robust optimization; Discrete component analysis; Mirror descent; Hypothesis testing
Date: 2022

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0377221721009759
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:ejores:v:299:y:2022:i:2:p:653-673

DOI: 10.1016/j.ejor.2021.11.024

European Journal of Operational Research is currently edited by Roman Slowinski, Jesus Artalejo, Jean-Charles Billaut, Robert Dyson and Lorenzo Peccati

More articles in European Journal of Operational Research from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:ejores:v:299:y:2022:i:2:p:653-673