Integrating Prediction and Attribution to Classify News
Nelson P. Rayl and Nitish R. Sinha
Additional contact information
Nitish R. Sinha: https://www.federalreserve.gov/econres/nitish-r-sinha.htm
No 2022-042, Finance and Economics Discussion Series from Board of Governors of the Federal Reserve System (U.S.)
Abstract:
Recent modeling developments have created tradeoffs between attribution-based models, models that rely on causal relationships, and “pure prediction models” such as neural networks. While forecasters have historically favored one technology or the other based on comfort or loyalty to a particular paradigm, in domains with many observations and predictors such as textual analysis, the tradeoffs between attribution and prediction have become too large to ignore. We document these tradeoffs in the context of relabeling 27 million Thomson Reuters news articles published between 1996 and 2021 as debt-related or non-debt-related. Articles in our dataset were labeled by journalists at the time of publication, but these labels may be inconsistent because labeling standards and the relation between text and label have changed over time. We propose a method for identifying and correcting inconsistent labeling that combines attribution and pure prediction methods and is applicable to any domain with human-labeled data. Implementing our proposed labeling solution returns a debt-related news dataset with 54% more observations than if the original journalist labels had been used and 31% more observations than if our solution had been implemented using attribution-based methods only.
Keywords: News; Text Analysis; Debt; Labeling; Supervised Learning; DMR
JEL-codes: C40 C45 C55
Pages: 44 p.
Date: 2022-07-01
New Economics Papers: this item is included in nep-big and nep-cmp
Downloads: (external link)
https://www.federalreserve.gov/econres/feds/files/2022042pap.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:fip:fedgfe:2022-42
DOI: 10.17016/FEDS.2022.042