Using text mining to track outbreak trends in global surveillance of emerging diseases: ProMED‐mail
Jingxian You, Paul Expert and Céire Costelloe
Journal of the Royal Statistical Society Series A, 2021, vol. 184, issue 4, 1245-1259
Abstract:
ProMED‐mail (Program for Monitoring Emerging Diseases) is an international disease outbreak monitoring and early warning system. Every year, users contribute thousands of reports that include references to infectious diseases and toxins. However, because the reports are unevenly distributed across diseases, traditional statistics‐based text mining techniques, represented by term frequency‐related algorithms, are not suitable. Thus, we conducted a study in three steps: (i) report filtering, (ii) keyword extraction from reports and finally (iii) word co‐occurrence network analysis, to fill the gap between ProMED and its utilization. The keyword extraction was performed with the TextRank algorithm. Keyword co‐occurrence networks were then produced using the top keywords from each document, and multiple network centrality measures were computed to analyse these networks. We used two major outbreaks in recent years, Ebola in 2014 and Zika in 2015, as cases to illustrate and validate the process. We found that the extracted information structures are consistent with the World Health Organisation's description of the timeline and phases of the epidemics. Our research presents a pipeline that can extract and organize the information to characterize the evolution of epidemic outbreaks. It also highlights the potential for ProMED to be utilized in monitoring, evaluating and improving responses to outbreaks.
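Illustrative sketch (not the authors' code): the three steps named in the abstract can be approximated in a few lines of standard-library Python. Everything here is an assumption made for illustration: the stop-word list, the function names, the window size, the damping factor, and the use of degree centrality as a stand-in for the "multiple network centrality measures" that the abstract does not enumerate. The TextRank step is a simplified, unweighted variant that ranks single words only.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "in", "and", "to", "is", "are", "for", "on", "with", "by"}


def tokenize(text):
    """Lowercase the text, strip surrounding punctuation, drop stop words."""
    words = (w.strip(".,;:()\"'").lower() for w in text.split())
    return [w for w in words if w.isalpha() and w not in STOPWORDS]


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Simplified TextRank: PageRank-style scores on an undirected word graph."""
    tokens = tokenize(text)
    neighbours = defaultdict(set)
    # Link each word to the words that follow it within the sliding window.
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)
    scores = {w: 1.0 for w in neighbours}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbour's previous score.
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbours[n]) for n in neighbours[w])
            for w in neighbours
        }
    # Highest score first; ties are broken alphabetically for reproducibility.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def cooccurrence_network(keyword_lists):
    """Edge weight = number of documents in which two keywords appear together."""
    weights = defaultdict(int)
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(weights):
    """Fraction of the other nodes that each node is directly connected to."""
    nodes = {n for edge in weights for n in edge}
    degree = defaultdict(int)
    for a, b in weights:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(nodes) - 1, 1)
    return {n: degree[n] / denom for n in nodes}


if __name__ == "__main__":
    # Made-up example sentences, not real ProMED-mail reports.
    reports = [
        "Ebola outbreak reported in Guinea. Ebola cases rise in Guinea.",
        "Ebola vaccine trial planned in Guinea and Liberia.",
    ]
    keyword_lists = [textrank_keywords(r) for r in reports]
    network = cooccurrence_network(keyword_lists)
    print(degree_centrality(network))
```

In this sketch the report-filtering step is reduced to choosing which strings go into `reports`; a real pipeline would need an explicit filter (for example by disease name and date range) before keyword extraction.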
Date: 2021
Downloads:
https://doi.org/10.1111/rssa.12721
Persistent link: https://EconPapers.repec.org/RePEc:bla:jorssa:v:184:y:2021:i:4:p:1245-1259
Journal of the Royal Statistical Society Series A is currently edited by A. Chevalier and L. Sharples