Is Deep-Learning and Natural Language Processing Transcending the Financial Forecasting? Investigation Through Lens of News Analytic Process

Khalil, Faisal; Pipa, Gordon

Faisal Khalil: Institute of Cognitive Science
Gordon Pipa: Institute of Cognitive Science

Computational Economics, 2022, vol. 60, issue 1, No 7, 147-171

Abstract: This study addresses the stock market prediction puzzle using textual analytics, combining natural language processing (NLP) techniques with a deep-learning recurrent model, long short-term memory (LSTM). Instead of traditional count-based sentiment index methods, the study uses its own sum- and relevance-based sentiment index mechanism. Hourly price data are used because daily data arrive too late and minute-level data too early to isolate the effect of sentiment. Hourly data are normally costly and difficult to manage and analyze, and have rarely been used in similar research. To build the sentiment index, textual information relevant to the selected stocks is collected, parsed, aggregated, categorized, and refined with NLP, and then converted into an hourly sentiment index. News analytic sources include mainstream media, print media, social media, news feeds, blogs, investors’ advisory portals, experts’ opinions, brokers’ updates, web-based information, companies’ internal news, and public announcements regarding policies and reforms. The results indicate that sentiment significantly influences the direction of stocks, on average after 3–4 hours. The top ten companies from the high-tech, financial, medical, and automobile sectors are selected, and six LSTM models are used: three with text analytics and three without. In each group, the models look back 1, 3, and 6 hourly steps. For all sectors, the model based on 6 hourly steps outperforms the others, owing to the LSTM’s ability to retain long-term memory. The collective accuracy of the textual analytic models is considerably higher than that of the non-textual analytic models.
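
The abstract does not give the formula for the sentiment index or the exact input layout of the LSTM models, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each news item carries a sentiment score in [-1, 1] and a relevance score in [0, 1] (hypothetical scales), sums the relevance-weighted scores per hour, and then builds look-back windows of 1, 3, or 6 hourly steps that a recurrent model such as an LSTM could consume. The names `hourly_sentiment_index` and `make_windows` are illustrative and do not come from the paper.

```python
from collections import defaultdict
from datetime import datetime


def hourly_sentiment_index(news_items):
    """Sum relevance-weighted sentiment per hour.

    news_items: iterable of (timestamp, sentiment, relevance) tuples.
    The weighting (sentiment * relevance) is an assumption for illustration.
    """
    index = defaultdict(float)
    for ts, sentiment, relevance in news_items:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        index[hour] += sentiment * relevance
    return dict(index)


def make_windows(features, prices, lookback):
    """Pair each window of `lookback` hourly feature rows with a direction label.

    The label is 1 if the price at hour t is above the price at hour t - 1,
    else 0. The window covers hours t - lookback up to t - 1.
    """
    samples = []
    for t in range(lookback, len(prices)):
        window = features[t - lookback:t]
        label = 1 if prices[t] > prices[t - 1] else 0
        samples.append((window, label))
    return samples


# Toy data, invented purely for illustration.
news = [
    (datetime(2021, 3, 1, 9, 5), 0.8, 0.5),
    (datetime(2021, 3, 1, 9, 40), -0.2, 1.0),
    (datetime(2021, 3, 1, 10, 15), 0.5, 0.4),
]
index = hourly_sentiment_index(news)

prices = [100.0, 101.0, 100.5, 102.0, 103.0, 102.5, 104.0, 105.0]
sentiment = [0.2, 0.2, 0.0, -0.1, 0.3, 0.0, 0.1, 0.4]
features = list(zip(prices, sentiment))

# One sample set per look-back length mentioned in the abstract.
windows = {k: make_windows(features, prices, k) for k in (1, 3, 6)}
```

Dropping the sentiment column from `features` would give the corresponding input for a model without text analytics, which is one way to read the comparison between the two groups of models described above.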

Keywords: LSTM; Natural language processing; News analytic; Sentiment analysis; Stock prediction
Date: 2022

Downloads: (external link)
http://link.springer.com/10.1007/s10614-021-10145-2 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:kap:compec:v:60:y:2022:i:1:d:10.1007_s10614-021-10145-2

Ordering information: This journal article can be ordered from
http://www.springer. ... ry/journal/10614/PS2

DOI: 10.1007/s10614-021-10145-2

Computational Economics is currently edited by Hans Amman

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.