Predicting Returns With Text Data
Zheng Tracy Ke, Bryan T. Kelly and Dacheng Xiu
No 26186, NBER Working Papers from National Bureau of Economic Research, Inc
Abstract:
We introduce a new text-mining methodology that extracts sentiment information from news articles to predict asset returns. Unlike more common sentiment scores used for stock return prediction (e.g., those sold by commercial vendors or built with dictionary-based methods), our supervised learning framework constructs a sentiment score that is specifically adapted to the problem of return prediction. Our method proceeds in three steps: 1) isolating a list of sentiment terms via predictive screening, 2) assigning sentiment weights to these words via topic modeling, and 3) aggregating terms into an article-level sentiment score via penalized likelihood. We derive theoretical guarantees on the accuracy of estimates from our model with minimal assumptions. In our empirical analysis, we text-mine one of the most actively monitored streams of news articles in the financial system, the Dow Jones Newswires, and show that our supervised sentiment model excels at extracting return-predictive signals in this context.
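The three steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the specific choices below are assumptions made for illustration. They include the function names, the screening threshold `alpha`, the document-frequency floor `min_docs`, the use of return ranks as a sentiment proxy, the two-topic mixture, the penalty weight `lam`, the grid search, and the toy data.

```python
import numpy as np

def screen_terms(counts, returns, alpha=0.1, min_docs=2):
    """Step 1 (assumed form): keep words that appear in positive-return
    articles noticeably more or less than half of the time."""
    present = counts > 0                          # (n_docs, n_words) indicator
    doc_freq = present.sum(axis=0)
    pos_freq = present[returns > 0].sum(axis=0)
    frac_pos = np.divide(pos_freq, doc_freq,
                         out=np.full(doc_freq.shape, 0.5),
                         where=doc_freq > 0)
    keep = (np.abs(frac_pos - 0.5) >= alpha) & (doc_freq >= min_docs)
    return np.flatnonzero(keep)

def estimate_topics(counts_s, returns):
    """Step 2 (assumed form): regress word frequencies on a rank-based
    sentiment proxy to get positive/negative topic weights per word."""
    n = len(returns)
    p = (returns.argsort().argsort() + 1) / n     # return rank scaled into (0, 1]
    totals = counts_s.sum(axis=1)
    ok = totals > 0                               # drop articles with no sentiment words
    D = (counts_s[ok] / totals[ok, None]).T       # (n_words, n_ok) word frequencies
    W = np.vstack([p[ok], 1 - p[ok]])             # (2, n_ok) mixture weights
    O = D @ W.T @ np.linalg.inv(W @ W.T)          # least-squares fit, (n_words, 2)
    O = np.clip(O, 0, None)                       # frequencies cannot be negative
    return O / np.maximum(O.sum(axis=0, keepdims=True), 1e-12)

def score_article(word_counts, O, lam=1.0, grid=None):
    """Step 3 (assumed form): pick the sentiment p in (0, 1) that maximizes
    a mixture log-likelihood plus a penalty pulling p toward 1/2."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    s = word_counts.sum()
    if s == 0:
        return 0.5                                # no sentiment words: neutral score
    mix = np.outer(grid, O[:, 0]) + np.outer(1 - grid, O[:, 1])
    loglik = (word_counts * np.log(np.maximum(mix, 1e-12))).sum(axis=1) / s
    penalty = lam * np.log(grid * (1 - grid))
    return grid[np.argmax(loglik + penalty)]

# Toy corpus (made up): columns are counts of "gain", "loss", "the".
counts = np.array([[3, 0, 2], [2, 0, 2], [2, 1, 2],
                   [0, 3, 2], [0, 2, 2], [1, 2, 2]])
returns = np.array([0.03, 0.02, 0.01, -0.03, -0.02, -0.01])

kept = screen_terms(counts, returns)              # indices of sentiment-charged words
O = estimate_topics(counts[:, kept], returns)
score_up = score_article(np.array([3, 0]), O)     # article using only "gain"
score_down = score_article(np.array([0, 3]), O)   # article using only "loss"
```

In this toy setup the neutral word "the" is screened out because it appears equally often with positive and negative returns, and the penalty term keeps the score of a short, one-sided article away from the extremes 0 and 1.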
JEL-codes: C53 C55 C58 G10 G11 G12 G14 G17 G4
Date: 2019-08
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ecm, nep-fmk, nep-for, nep-ore and nep-pay
Note: AP
Citations: 46 (in EconPapers)
Downloads: (external link)
http://www.nber.org/papers/w26186.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:nbr:nberwo:26186
Ordering information: This working paper can be ordered from
http://www.nber.org/papers/w26186
Series: NBER Working Papers from National Bureau of Economic Research, Inc, 1050 Massachusetts Avenue, Cambridge, MA 02138, U.S.A.