EconPapers    
Economics at your fingertips  
 

TBCOV: Two Billion Multilingual COVID-19 Tweets with Sentiment, Entity, Geo, and Gender Labels

Muhammad Imran, Umair Qazi and Ferda Ofli
Additional contact information
Muhammad Imran: Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar
Umair Qazi: Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar
Ferda Ofli: Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar

Data, 2022, vol. 7, issue 1, 1-27

Abstract: As the world struggles with several compounded challenges caused by the COVID-19 pandemic in the health, economic, and social domains, timely access to disaggregated national and sub-national data are important to understand the emergent situation but it is difficult to obtain. The widespread usage of social networking sites, especially during mass convergence events, such as health emergencies, provides instant access to citizen-generated data offering rich information about public opinions, sentiments, and situational updates useful for authorities to gain insights. We offer a large-scale social sensing dataset comprising two billion multilingual tweets posted from 218 countries by 87 million users in 67 languages. We used state-of-the-art machine learning models to enrich the data with sentiment labels and named-entities. Additionally, a gender identification approach is proposed to segregate user gender. Furthermore, a geolocalization approach is devised to geotag tweets at country, state, county, and city granularities, enabling a myriad of data analysis tasks to understand real-world issues at national and sub-national levels. We believe this multilingual data with broader geographical and longer temporal coverage will be a cornerstone for researchers to study impacts of the ongoing global health catastrophe and to manage adverse consequences related to people’s health, livelihood, and social well-being.

Keywords: social sensing; COVID-19; sentiment analysis; trends analysis; geo-mapping; natural cities (search for similar items in EconPapers)
JEL-codes: C8 C80 C81 C82 C83 (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/2306-5729/7/1/8/pdf (application/pdf)
https://www.mdpi.com/2306-5729/7/1/8/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:7:y:2022:i:1:p:8-:d:721364

Access Statistics for this article

Data is currently edited by Ms. Cecilia Yang

More articles in Data from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().

 
Page updated 2025-03-19
Handle: RePEc:gam:jdataj:v:7:y:2022:i:1:p:8-:d:721364