EconPapers    
Economics at your fingertips  
 

Using text mining to track outbreak trends in global surveillance of emerging diseases: ProMED‐mail

Jingxian You, Paul Expert and Céire Costelloe

Journal of the Royal Statistical Society Series A, 2021, vol. 184, issue 4, 1245-1259

Abstract: ProMED‐mail (Program for Monitoring Emerging Disease) is an international disease outbreak monitoring and early warning system. Every year, users contribute thousands of reports that include reference to infectious diseases and toxins. However, due to the uneven distribution of the reports for each disease, traditional statistics‐based text mining techniques, represented by term frequency‐related algorithm, are not suitable. Thus, we conducted a study in three steps (i) report filtering, (ii) keyword extraction from reports and finally (iii) word co‐occurrence network analysis to fill the gap between ProMED and its utilization. The keyword extraction was performed with the TextRank algorithm, keywords co‐occurrence networks were then produced using the top keywords from each document and multiple network centrality measures were computed to analyse the co‐occurrence networks. We used two major outbreaks in recent years, Ebola, 2014 and Zika 2015, as cases to illustrate and validate the process. We found that the extracted information structures are consistent with World Health Organisation description of the timeline and phases of the epidemics. Our research presents a pipeline that can extract and organize the information to characterize the evolution of epidemic outbreaks. It also highlights the potential for ProMED to be utilized in monitoring, evaluating and improving responses to outbreaks.

Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://doi.org/10.1111/rssa.12721

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bla:jorssa:v:184:y:2021:i:4:p:1245-1259

Ordering information: This journal article can be ordered from
http://ordering.onli ... 1111/(ISSN)1467-985X

Access Statistics for this article

Journal of the Royal Statistical Society Series A is currently edited by A. Chevalier and L. Sharples

More articles in Journal of the Royal Statistical Society Series A from Royal Statistical Society Contact information at EDIRC.
Bibliographic data for series maintained by Wiley Content Delivery ().

 
Page updated 2025-03-19
Handle: RePEc:bla:jorssa:v:184:y:2021:i:4:p:1245-1259