Machine Learning for Analyzing Non-Countermeasure Factors Affecting Early Spread of COVID-19

Vito Janko, Gašper Slapničar, Erik Dovgan, Nina Reščič, Tine Kolenik, Martin Gjoreski, Maj Smerkol, Matjaž Gams and Mitja Luštrek
Additional contact information
Vito Janko: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Gašper Slapničar: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Erik Dovgan: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Nina Reščič: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Tine Kolenik: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Martin Gjoreski: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Maj Smerkol: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Matjaž Gams: Jožef Stefan Institute, 1000 Ljubljana, Slovenia
Mitja Luštrek: Jožef Stefan Institute, 1000 Ljubljana, Slovenia

IJERPH, 2021, vol. 18, issue 13, 1-33

Abstract: The COVID-19 pandemic affected the whole world, but not all countries were impacted equally. This raises the question of which factors can explain the faster initial spread in some countries than in others. Many such factors are overshadowed by the effect of countermeasures, so we studied the early phases of the infection, before countermeasures had been put in place. We collected the most diverse dataset of potentially relevant factors and infection metrics to date for this task. Using it, we show the importance of different factors and factor categories as determined by both statistical methods and machine learning (ML) feature selection (FS) approaches. Factors related to culture (e.g., individualism, openness), development, and travel proved the most important. A more thorough factor analysis was then performed using a novel rule discovery algorithm. We also show how interconnected these factors are and caution against relying on ML analysis in isolation. Importantly, we explore potential pitfalls found in the methodology of similar work and demonstrate their impact on COVID-19 data analysis. Our best models, which use a decision tree classifier, predict the infection class with roughly 80% accuracy.

Keywords: COVID-19; machine learning; feature significance; feature correlation; risk factors
JEL-codes: I I1 I3 Q Q5
Date: 2021

Downloads: (external link)
https://www.mdpi.com/1660-4601/18/13/6750/pdf (application/pdf)
https://www.mdpi.com/1660-4601/18/13/6750/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:18:y:2021:i:13:p:6750-:d:580451

IJERPH is currently edited by Ms. Jenna Liu

Page updated 2025-03-19
Handle: RePEc:gam:jijerp:v:18:y:2021:i:13:p:6750-:d:580451