Early Warning of Infectious Disease Outbreaks Using Social Media and Digital Data: A Scoping Review
Yamil Liscano,
Luis A. Anillo Arrieta,
John Fernando Montenegro,
Diego Prieto-Alvarado and
Jorge Ordoñez
Additional contact information
Yamil Liscano: Grupo de Investigación en Salud Integral (GISI), Departamento Facultad de Salud, Universidad Santiago de Cali, Cali 760035, Colombia
Luis A. Anillo Arrieta: School of Basic Sciences, Technology, and Engineering, Universidad Nacional Abierta y a Distancia–UNAD, Barranquilla 080005, Colombia
John Fernando Montenegro: Grupo de Investigación en Salud Integral (GISI), Departamento Facultad de Salud, Universidad Santiago de Cali, Cali 760035, Colombia
Diego Prieto-Alvarado: Grupo de Investigación en Salud Integral (GISI), Departamento Facultad de Salud, Universidad Santiago de Cali, Cali 760035, Colombia
Jorge Ordoñez: Grupo de Investigación en Salud Integral (GISI), Departamento Facultad de Salud, Universidad Santiago de Cali, Cali 760035, Colombia
IJERPH, 2025, vol. 22, issue 7, 1-34
Abstract:
Background and Aim: Digital surveillance, which draws on data from social media, search engines, and other online platforms, has emerged as an innovative approach for the early detection of infectious disease outbreaks. This scoping review aimed to systematically map and characterize the methodologies, performance metrics, and limitations of digital surveillance tools compared to traditional epidemiological monitoring. Methods: A scoping review was conducted in accordance with the Joanna Briggs Institute and PRISMA-ScR guidelines. PubMed, Scopus, and Web of Science were searched without language restrictions, and both empirical studies and systematic reviews were included. Key elements analyzed included digital sources, analytical algorithms, accuracy metrics, and validation against official surveillance data. Results: The reviewed studies show that digital surveillance can provide substantial lead times (from days to several weeks) over traditional systems. While performance varies by platform and disease, many models showed strong correlations (r > 0.8) with official case data and achieved low predictive errors, particularly for influenza and COVID-19. Google Trends and X (formerly Twitter) emerged as the most frequently used sources, often analyzed using supervised regression, Bayesian models, and ARIMA techniques. Conclusions: While digital surveillance shows strong predictive capabilities, it faces challenges related to data quality and representativeness. Key recommendations include developing standardized reporting guidelines to improve comparability across studies, using statistical techniques such as stratification and model weighting to mitigate demographic biases, and leveraging advanced artificial intelligence to differentiate genuine health signals from media-driven noise. These steps are crucial for enhancing the reliability and equity of digital epidemiological monitoring.
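Illustrative sketch (not taken from the article): the abstract reports lead times and correlations (r > 0.8) between digital signals and official case data. One common way such figures can be estimated is a lagged Pearson correlation, shown below in plain Python. The function names `pearson` and `best_lead`, the `max_lag` parameter, and both weekly series are hypothetical and chosen only for illustration. The reviewed studies may have used different methods, and the sketch assumes neither series is constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_lead(digital, official, max_lag):
    """Return (lag, r) maximizing the correlation of digital[t] with official[t + lag]."""
    best = (0, pearson(digital, official))
    for lag in range(1, max_lag + 1):
        # Drop the last `lag` digital points and the first `lag` official points
        # so that each digital value is paired with the official value `lag` steps later.
        r = pearson(digital[:-lag], official[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Made-up weekly data: the digital signal peaks two weeks before reported cases.
search_index = [5, 8, 20, 45, 80, 60, 30, 12, 6, 5]
reported_cases = [4, 5, 6, 9, 21, 44, 82, 61, 29, 13]

lag, r = best_lead(search_index, reported_cases, max_lag=4)
print(f"best lead: {lag} weeks, r = {r:.2f}")
```

With these made-up series the highest correlation occurs at a two-week lag, which would be read as the digital signal leading official reports by two weeks.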
Keywords: disease outbreaks; epidemiological surveillance; social media; infodemiology; artificial intelligence; time series analysis
JEL-codes: I I1 I3 Q Q5
Date: 2025
Downloads:
https://www.mdpi.com/1660-4601/22/7/1104/pdf (application/pdf)
https://www.mdpi.com/1660-4601/22/7/1104/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:22:y:2025:i:7:p:1104-:d:1700831
IJERPH is currently edited by Ms. Jenna Liu