Structured Embedding Models for Grouped Data

Rudolph, Maja; Ruiz, Francisco; Athey, Susan; Blei, David

Additional contact information
Maja Rudolph: Columbia University
Francisco Ruiz: University of Cambridge
Susan Athey: Stanford University
David Blei: Columbia University

Research Papers from Stanford University, Graduate School of Business

Abstract: Word embeddings are a powerful approach for analyzing language, and exponential family embeddings (EFE) extend them to other types of data. Here we develop structured exponential family embeddings (S-EFE), a method for discovering embeddings that vary across related groups of data. We study how the word usage of U.S. Congressional speeches varies across states and party affiliation, how words are used differently across sections of the ArXiv, and how the co-purchase patterns of groceries can vary across seasons. Key to the success of our method is that the groups share statistical information. We develop two sharing strategies: hierarchical modeling and amortization. We demonstrate the benefits of this approach in empirical studies of speeches, abstracts, and shopping baskets. We show how S-EFE enables group-specific interpretation of word usage, and outperforms EFE in predicting held-out data.

Date: 2017-09

Downloads: (external link)
https://www.gsb.stanford.edu/gsb-cmis/gsb-cmis-download-auth/442661

Persistent link: https://EconPapers.repec.org/RePEc:ecl:stabus:repec:ecl:stabus:3597
