Random Forest and Feature Importance Measures for Discriminating the Most Influential Environmental Factors in Predicting Cardiovascular and Respiratory Diseases

Francesco Cappelli, Gianfranco Castronuovo, Salvatore Grimaldi and Vito Telesca
Additional contact information
Francesco Cappelli: DIBAF Department, University of Tuscia, 01100 Viterbo, Italy
Gianfranco Castronuovo: School of Engineering, University of Basilicata, Viale dell’Ateneo Lucano 10, 85100 Potenza, Italy
Salvatore Grimaldi: DIBAF Department, University of Tuscia, 01100 Viterbo, Italy
Vito Telesca: School of Engineering, University of Basilicata, Viale dell’Ateneo Lucano 10, 85100 Potenza, Italy

IJERPH, 2024, vol. 21, issue 7, 1-21

Abstract: Background: Several studies suggest that environmental and climatic factors are linked to the risk of mortality due to cardiovascular and respiratory diseases; however, it remains unclear which factors are the most influential. This study explores the potential of a data-driven statistical approach through a case study. Methods: Daily admissions to the emergency room for cardiovascular and respiratory diseases are analyzed jointly with daily values of environmental and climatic parameters (temperature, atmospheric pressure, relative humidity, carbon monoxide, ozone, particulate matter, and nitrogen dioxide). The Random Forest (RF) model and feature importance measure (FMI) techniques (permutation feature importance (PFI), Shapley Additive exPlanations (SHAP) feature importance, and the derivative-based importance measure ($\kappa_{ALE}$)) are applied to discriminate the role of each environmental and climatic parameter. Data are pre-processed to remove trend and seasonal behavior using Seasonal-Trend decomposition using LOESS (STL) and preliminarily analyzed to avoid redundancy of information. Results: The RF performance is encouraging, being able to predict cardiovascular and respiratory disease admissions with a mean absolute relative error of 0.04 and 0.05 cases per day, respectively. Feature importance measures discriminate parameter behaviors and provide importance rankings. Indeed, only three parameters (temperature, atmospheric pressure, and carbon monoxide) were responsible for most of the total prediction accuracy. Conclusions: Data-driven and statistical tools, like the feature importance measure, are promising for discriminating the role of environmental and climatic factors in predicting the risk related to cardiovascular and respiratory diseases. Our results reveal the potential of employing these tools in public health policy applications, both for developing early warning systems that address health risks associated with climate change and for improving disease prevention strategies.
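
As a rough illustration of the permutation feature importance (PFI) idea named in the abstract, the sketch below shuffles one input column at a time and records how much the mean absolute error of a fitted predictor increases. It is not the authors' code or data: the data are synthetic, the `predict` function is a hypothetical stand-in for a trained Random Forest, and the feature names are borrowed from the abstract purely as labels. With a real model, a library routine such as scikit-learn's `permutation_importance` would normally be used instead.

```python
import random


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MAE when each column of X is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, which breaks its link with the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_absolute_error(y, [predict(r) for r in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic, standardized inputs (assumption: NOT the study's data).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
# Hypothetical target: depends strongly on column 0, weakly on column 1,
# and not at all on column 2.
y = [3.0 * t + 1.0 * p + data_rng.gauss(0, 0.1) for t, p, _ in X]


def predict(row):
    """Hypothetical stand-in for a fitted model (e.g. a Random Forest)."""
    return 3.0 * row[0] + 1.0 * row[1]


names = ["temperature", "pressure", "humidity"]  # illustrative labels only
scores = permutation_importance(predict, X, y)
ranking = sorted(zip(names, scores), key=lambda kv: kv[1], reverse=True)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A feature the predictor never uses (here the third column) gets an importance of zero, because shuffling it leaves the predictions unchanged. Rankings of this kind are what the abstract refers to as importance rankings, although the study also reports SHAP and derivative-based measures that this sketch does not cover.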

Keywords: Feature Importance Measures; Machine Learning; Interpretability; Public Health; Cardiovascular Diseases; Respiratory Diseases
JEL-codes: I I1 I3 Q Q5
Date: 2024

Downloads:
https://www.mdpi.com/1660-4601/21/7/867/pdf (application/pdf)
https://www.mdpi.com/1660-4601/21/7/867/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:21:y:2024:i:7:p:867-:d:1427726

IJERPH is currently edited by Ms. Jenna Liu

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jijerp:v:21:y:2024:i:7:p:867-:d:1427726