Weakly Supervised and Online Learning of Word Models for Classification to Detect Disaster Reporting Tweets
Girish Keshav Palshikar,
Manoj Apte and
Deepak Pandita
Additional contact information
Girish Keshav Palshikar: Tata Consultancy Services Limited
Manoj Apte: Tata Consultancy Services Limited
Deepak Pandita: University of Rochester
Information Systems Frontiers, 2018, vol. 20, issue 5, No 5, 949-959
Abstract:
Social media has quickly established itself as an important means that people, NGOs and governments use to spread information during natural or man-made disasters, mass emergencies and crisis situations. Given this important role, real-time analysis of social media content to locate, organize and use valuable information for disaster management is crucial. In this paper, we propose self-learning algorithms that, with minimal supervision, construct a simple bag-of-words model of information expressed in the news about various natural disasters. Such a model is human-understandable, human-modifiable and usable in a real-time scenario. Since tweets are a different category of documents than news, we next propose a model transfer algorithm, which essentially refines the model learned from news by analyzing a large unlabeled corpus of tweets. We show empirically that model transfer improves the predictive accuracy of the model. We demonstrate empirically that our model learning algorithm is better than several state-of-the-art semi-supervised learning algorithms. Finally, we present an online algorithm that learns the weights for words in the model and demonstrate the efficacy of the model with word weights.
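To make the idea of a weighted bag-of-words model with online weight learning concrete, here is a minimal sketch. It is not the authors' algorithm: the abstract does not state the update rule, so a standard perceptron-style mistake-driven update is swapped in, and the class name `OnlineWordModel`, the seed words, the threshold and the learning rate are all hypothetical choices made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class OnlineWordModel:
    """Weighted bag-of-words classifier (illustrative, not the paper's method)."""

    def __init__(self, seed_words, threshold=1.0, learning_rate=0.5):
        # Assumption: every seed word starts with weight 1.0.
        self.weights = {word: 1.0 for word in seed_words}
        self.threshold = threshold
        self.learning_rate = learning_rate

    def score(self, text):
        # Sum the weights of the distinct words present in the text.
        return sum(self.weights.get(tok, 0.0) for tok in set(tokenize(text)))

    def predict(self, text):
        # A text is flagged as disaster-reporting if its score reaches the threshold.
        return self.score(text) >= self.threshold

    def update(self, text, label):
        # Perceptron-style online step: weights change only on a misclassification.
        if self.predict(text) == label:
            return
        step = self.learning_rate if label else -self.learning_rate
        for tok in set(tokenize(text)):
            self.weights[tok] = self.weights.get(tok, 0.0) + step


model = OnlineWordModel({"earthquake", "flood"})
model.update("Wildfire spreading near town", True)  # missed positive: raise weights
```

Because the model is just a dictionary from words to weights, it stays human-readable and human-editable, which is the property the abstract emphasizes.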
Keywords: Machine learning; Text classification; Weakly supervised learning; Online learning; Transfer learning; Tweet classification; Disaster management
Date: 2018
Downloads: (external link)
http://link.springer.com/10.1007/s10796-018-9830-2 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:infosf:v:20:y:2018:i:5:d:10.1007_s10796-018-9830-2
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10796
DOI: 10.1007/s10796-018-9830-2
Information Systems Frontiers is currently edited by Ram Ramesh and Raghav Rao
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.