On the Testability of the Anchor-Words Assumption in Topic Models
Simon Freyaldenhoven,
Shikun Ke,
Dingyi Li and
Jose Luis Montiel Olea
No 25-14, Working Papers from Federal Reserve Bank of Philadelphia
Abstract:
What does the Fed talk about in its monetary policy discussions? We introduce a new statistical methodology to analyze text documents, and we use that methodology to recover the topics discussed during FOMC meetings. Topic models are a simple and popular tool for the statistical analysis of textual data. Their identification and estimation are typically enabled by assuming the existence of anchor words; that is, words that are exclusive to specific topics. In this paper we show that the existence of anchor words is statistically testable: There exists a hypothesis test with correct size that has nontrivial power. This means that the anchor-words assumption cannot be viewed simply as a convenient normalization. Central to our results is a simple characterization of when a column-stochastic matrix with known nonnegative rank admits a separable factorization. We test for the existence of anchor words in two different datasets derived from monetary policy discussions in the Federal Reserve and reject the null hypothesis that anchor words exist in one of them.
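To make the anchor-words condition concrete, the following is a minimal sketch in Python. It is not the hypothesis test developed in the paper. It only checks the definition stated in the abstract on a known word-by-topic matrix: every topic has at least one word with positive probability under that topic and zero probability under all others. The function name `has_anchor_words`, the tolerance `tol`, and both example matrices are hypothetical and chosen purely for illustration.

```python
import numpy as np

def has_anchor_words(A, tol=1e-12):
    """Check the anchor-words definition on a known word-by-topic matrix.

    A is assumed to be column-stochastic: rows are words, columns are topics,
    and each column sums to 1. Returns (ok, anchors), where anchors maps each
    topic index to the list of its anchor-word row indices.
    """
    A = np.asarray(A, dtype=float)
    positive = A > tol
    # A word is exclusive when exactly one topic assigns it positive probability.
    exclusive = positive.sum(axis=1) == 1
    anchors = {}
    for k in range(A.shape[1]):
        rows = np.flatnonzero(exclusive & positive[:, k])
        anchors[k] = rows.tolist()
    ok = all(len(rows) > 0 for rows in anchors.values())
    return ok, anchors

# Hypothetical 4-word, 2-topic matrices; every column sums to 1.
A_sep = np.array([[0.5, 0.0],   # word 0 appears only in topic 0
                  [0.0, 0.4],   # word 1 appears only in topic 1
                  [0.3, 0.3],
                  [0.2, 0.3]])
A_nonsep = np.array([[0.4, 0.1],
                     [0.1, 0.4],
                     [0.3, 0.3],
                     [0.2, 0.2]])  # every word appears in both topics

ok_sep, anchors_sep = has_anchor_words(A_sep)
ok_nonsep, anchors_nonsep = has_anchor_words(A_nonsep)
```

In practice the topic matrix is not observed, only noisy word counts, which is why the existence of anchor words has to be assessed with a statistical test rather than a direct check like this one.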
Keywords: Anchor Words; Topic Models; Nonnegative Matrix Factorization; Hypothesis Testing
JEL-codes: C39 C55
Pages: 78
Date: 2025-03-19
Downloads:
https://www.philadelphiafed.org/-/media/FRBP/Asset ... ers/2025/wp25-14.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:fip:fedpwp:99851
DOI: 10.21799/frbp.wp.2025.14
Bibliographic data for series maintained by Beth Paul.