Assessing the Nationwide COVID-19 Risk in Mexico through the Lens of Comorbidity by an XGBoost-Based Logistic Regression Model
Sonia Venancio-Guzmán,
Alejandro Ivan Aguirre-Salado,
Carlos Soubervielle-Montalvo and
José del Carmen Jiménez-Hernández
Additional contact information
Sonia Venancio-Guzmán: Institute of Physics and Mathematics, Universidad Tecnológica de la Mixteca, Huajuapan de León C.P. 69000, Mexico
Alejandro Ivan Aguirre-Salado: Institute of Physics and Mathematics, Universidad Tecnológica de la Mixteca, Huajuapan de León C.P. 69000, Mexico
Carlos Soubervielle-Montalvo: Faculty of Engineering, Universidad Autónoma de San Luis Potosí, San Luis Potosí C.P. 78280, Mexico
José del Carmen Jiménez-Hernández: Institute of Physics and Mathematics, Universidad Tecnológica de la Mixteca, Huajuapan de León C.P. 69000, Mexico
IJERPH, 2022, vol. 19, issue 19, 1-19
Abstract:
The outbreak of COVID-19 is a serious health problem that has affected a large part of the world population, especially older adults and people with a pre-existing comorbidity. In this work, we propose a classifier that predicts whether a patient is likely to have COVID-19, based on spatio-temporal variables, the physical characteristics of the patient and the presence of previous diseases. We used XGBoost to maximize the likelihood function of the multivariate logistic regression model. The estimated and observed percentages of case occurrence were very similar, indicating that the proposed model is suitable for predicting new cases (AUC = 0.75). The main results revealed that patients without comorbidities are less likely to be COVID-19 positive than patients with diabetes, obesity or pneumonia. The distribution function by age group showed that, during the first and second waves of COVID-19, young people aged ≤ 20 years were the least affected by the pandemic, while the most affected were people between 20 and 40 years, followed by adults older than 40 years. In the third and fourth waves, the risk increased for young individuals (under 20 years) and decreased for adults over 40 years. Estimates of positive COVID-19 cases from both the XGBoost-LR model and the multivariate logistic regression model were used to create maps of the spatial distribution of positive cases across the country. A spatial analysis was carried out to identify the main geographical areas where the greatest number of positive cases occurred. The results showed that the areas most affected by COVID-19 were in the central and northern regions of Mexico.
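To make the two quantities named in the abstract concrete, here is a minimal sketch of a logistic regression log-likelihood and of the AUC. It is not the authors' XGBoost-LR implementation: plain gradient ascent stands in for XGBoost as the optimizer, and the toy rows, the column meanings and all function names are hypothetical, chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Predicted probability of a positive test for one feature row."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def log_likelihood(weights, rows, labels):
    """Bernoulli log-likelihood of the logistic regression model."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = predict(weights, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(rows, labels, steps=500, lr=0.1):
    """Maximize the log-likelihood by plain gradient ascent (assumption: not XGBoost)."""
    weights = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            residual = y - predict(weights, x)
            for j, xi in enumerate(x):
                grad[j] += residual * xi
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: [intercept, diabetes, obesity] -> test result (1 = positive).
rows = ([[1, 0, 0]] * 3) + ([[1, 1, 0]] * 2) + ([[1, 0, 1]] * 2) + ([[1, 1, 1]] * 3)
labels = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0]

weights = fit_logistic(rows, labels)
scores = [predict(weights, x) for x in rows]
auc_value = auc(scores, labels)
print("log-likelihood:", log_likelihood(weights, rows, labels))
print("AUC:", auc_value)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a reference scale for the reported value of 0.75.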
Keywords: coronavirus; comorbidity; spatial analysis; logistic regression; ROC curve
JEL-codes: I I1 I3 Q Q5
Date: 2022
Downloads:
https://www.mdpi.com/1660-4601/19/19/11992/pdf (application/pdf)
https://www.mdpi.com/1660-4601/19/19/11992/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:19:y:2022:i:19:p:11992-:d:922011
IJERPH is currently edited by Ms. Jenna Liu
Bibliographic data for series maintained by MDPI Indexing Manager.