Using four different online media sources to forecast the crude oil price
A. Fronzetti Colladon,
E. Battistoni and
P. A. Gloor
Papers from arXiv.org
This study looks for signals of economic awareness on online social media and tests their significance in economic predictions. The study analyses, over a period of two years, the relationship between the West Texas Intermediate daily crude oil price and multiple predictors extracted from Twitter, Google Trends, Wikipedia, and the Global Data on Events, Language, and Tone database (GDELT). Semantic analysis is applied to study the sentiment, emotionality and complexity of the language used. Autoregressive Integrated Moving Average with Explanatory Variable (ARIMAX) models are used to make predictions and to confirm the value of the study variables. Results show that the combined analysis of the four media platforms carries valuable information in making financial forecasting. Twitter language complexity, GDELT number of articles and Wikipedia page reads have the highest predictive power. This study also allows a comparison of the different fore-sighting abilities of each platform, in terms of how many days ahead a platform can predict a price movement before it happens. In comparison with previous work, more media sources and more dimensions of the interaction and of the language used are combined in a joint analysis.
New Economics Papers: this item is included in nep-big, nep-ene and nep-for
References: View references in EconPapers View complete reference list from CitEc
Citations: Track citations by RSS feed
Published in Journal of Information Science 44(3), 408-421 (2018)
Downloads: (external link)
http://arxiv.org/pdf/2105.09154 Latest version (application/pdf)
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2105.09154
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().