
Asset Embeddings

Xavier Gabaix, Ralph Koijen, Robert Richmond and Motohiro Yogo

No 20082, CEPR Discussion Papers from Centre for Economic Policy Research

Abstract: Firm characteristics, based on accounting and financial market data, are commonly used to represent firms in economics and finance. However, investors collectively use a much richer information set beyond firm characteristics, including sources of information that are not readily available to researchers. We show theoretically that portfolio holdings contain all relevant information for asset pricing, which can be recovered under empirically realistic conditions. Such guarantees do not exist for other data sources, such as accounting or text data. We build on recent advances in artificial intelligence (AI) and machine learning (ML) that represent unstructured data (e.g., text, audio, and images) by high-dimensional latent vectors called embeddings. Just as word embeddings leverage the document structure to represent words, asset embeddings leverage portfolio holdings to represent firms. Thus, this paper is a bridge from recent advances in AI and ML to economics and finance. We explore various methods to estimate asset embeddings, including recommender systems, shallow neural network models such as Word2Vec, and transformer models such as BERT. We evaluate the performance of these models on three benchmarks that can be evaluated using a single quarter of data: predicting relative valuations, explaining the comovement of stock returns, and predicting institutional portfolio decisions. We also estimate investor embeddings (i.e., representations of investors and their strategies), which are useful for investor classification, performance evaluation, and detecting crowded trades. We discuss other applications of asset embeddings, including generative portfolios, risk management, and stress testing. Finally, we develop a framework to give an economic narrative to a group of similar firms, by applying large language models to firm-level text data.
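
As a rough illustration of the idea that portfolio holdings can be factored into asset representations, the sketch below computes two-dimensional embeddings from a small holdings matrix. It is not the paper's estimator. It uses a plain spectral factorization (the top eigenvectors of the co-holding matrix, which corresponds to a truncated SVD of the holdings matrix), a basic latent-factor approach of the kind that underlies simple recommender systems. The toy data, the names `ASSETS`, `HOLDINGS`, `asset_embeddings`, and `cosine`, and the choice of two dimensions are all assumptions made for illustration.

```python
import math

# Hypothetical toy data (not from the paper): rows are investors,
# columns are assets A-D, entries are portfolio weights.
ASSETS = ["A", "B", "C", "D"]
HOLDINGS = [
    [0.5, 0.5, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.3, 0.7],
    [0.0, 0.0, 0.4, 0.6],
]


def gram(h):
    """Asset-by-asset co-holding matrix H^T H."""
    n = len(h[0])
    return [[sum(row[i] * row[j] for row in h) for j in range(n)]
            for i in range(n)]


def top_eigenpair(m, iters=500):
    """Leading eigenvalue/eigenvector of a symmetric PSD matrix
    via power iteration, started from a uniform vector."""
    n = len(m)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def asset_embeddings(h, k=2):
    """Rank-k embeddings: each eigenvector scaled by the square root
    of its eigenvalue, with deflation between components."""
    m = gram(h)
    n = len(m)
    emb = [[] for _ in range(n)]
    for _ in range(k):
        lam, v = top_eigenpair(m)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            emb[i].append(scale * v[i])
        # Remove the component just extracted before finding the next one.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return emb


def cosine(x, y):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))


EMB = asset_embeddings(HOLDINGS, k=2)
for name, vec in zip(ASSETS, EMB):
    print(name, [round(x, 3) for x in vec])
```

In this toy example, assets A and B are held only by the first two investors and C and D only by the remaining three, so the embeddings of A and B point in nearly the same direction while those of A and C are nearly orthogonal. Calling `cosine(EMB[0], EMB[1])` and `cosine(EMB[0], EMB[2])` makes that comparison explicit.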

Keywords: Artificial intelligence; Asset pricing; Machine learning; Transformer models
JEL-codes: C53 G12 G23
Date: 2025-03

Downloads: (external link)
https://cepr.org/publications/DP20082 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:cpr:ceprdp:20082

Ordering information: This working paper can be ordered from
https://cepr.org/publications/DP20082

More papers in CEPR Discussion Papers from Centre for Economic Policy Research 33 Great Sutton Street, London EC1V 0DX, UK.
Bibliographic data for series maintained by CEPR.

 
Page updated 2026-05-29
Handle: RePEc:cpr:ceprdp:20082