A Hybrid Approach of Machine Learning and Lexicons to Sentiment Analysis: Enhanced Insights from Twitter Data of Natural Disasters
Shalak Mendon,
Pankaj Dutta (),
Abhishek Behl and
Stefan Lessmann
Additional contact information
Shalak Mendon: Electronic City
Pankaj Dutta: Indian Institute of Technology Bombay
Abhishek Behl: Indian Institute of Technology Bombay
Stefan Lessmann: Humboldt-Universität zu Berlin
Information Systems Frontiers, 2021, vol. 23, issue 5, No 6, 1145-1168
Abstract:
Abstract The success factor of sentimental analysis lies in identifying the most occurring and relevant opinions among users relating to the particular topic. In this paper, we develop a framework to analyze users’ sentiments on Twitter on natural disasters using the data pre-processing techniques and a hybrid of machine learning, statistical modeling, and lexicon-based approach. We choose TF-IDF and K-means for sentiment classification among affinitive and hierarchical clustering. Latent Dirichlet Allocation, a pipeline of Doc2Vec and K-means used to capture themes, then perform multi-level polarity indices classification and its time series analysis. In our study, we draw insights from 243,746 tweets for Kerala’s 2018 natural disasters in India. The key findings of the study are the classification of sentiments based on similarity and polarity indices and identifying themes among the topics discussed on Twitter. We observe different sets of emotions and influencers, among others. Through this case example of Kerala floods, it shows how the government and other organizations could track the positive/negative sentiments concerning time and location; gain a better understanding of the topic of discussion trending among the public, and collaborate with crucial Twitter users/influencers to spread and figure out the gaps in the implementation of schemes in terms of design and execution. This research’s uniqueness is the streamlined and efficient combination of algorithms and techniques embedded in the framework used in achieving the above output, which can be integrated into a platform with GUI for further automation.
Keywords: Sentimental analysis; K-means clustering; Latent Dirichlet allocation; Machine learning; Twitter; Natural disasters (search for similar items in EconPapers)
Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (6)
Downloads: (external link)
http://link.springer.com/10.1007/s10796-021-10107-x Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:infosf:v:23:y:2021:i:5:d:10.1007_s10796-021-10107-x
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10796
DOI: 10.1007/s10796-021-10107-x
Access Statistics for this article
Information Systems Frontiers is currently edited by Ram Ramesh and Raghav Rao
More articles in Information Systems Frontiers from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().