Identifying the Main Risk Factors for Cardiovascular Diseases Prediction Using Machine Learning Algorithms
Luis Rolando Guarneros-Nolasco,
Nancy Aracely Cruz-Ramos,
Giner Alor-Hernández,
Lisbeth Rodríguez-Mazahua and
José Luis Sánchez-Cervantes
Additional contact information
Luis Rolando Guarneros-Nolasco: División de Estudios de Posgrado e Investigación, Tecnológico Nacional de México/I.T. Orizaba, Av. Oriente 9 No. 852 Col. Emiliano Zapata, Orizaba, Veracruz C.P. 94320, Mexico
Nancy Aracely Cruz-Ramos: División de Estudios de Posgrado e Investigación, Tecnológico Nacional de México/I.T. Orizaba, Av. Oriente 9 No. 852 Col. Emiliano Zapata, Orizaba, Veracruz C.P. 94320, Mexico
Giner Alor-Hernández: División de Estudios de Posgrado e Investigación, Tecnológico Nacional de México/I.T. Orizaba, Av. Oriente 9 No. 852 Col. Emiliano Zapata, Orizaba, Veracruz C.P. 94320, Mexico
Lisbeth Rodríguez-Mazahua: División de Estudios de Posgrado e Investigación, Tecnológico Nacional de México/I.T. Orizaba, Av. Oriente 9 No. 852 Col. Emiliano Zapata, Orizaba, Veracruz C.P. 94320, Mexico
José Luis Sánchez-Cervantes: CONACYT, Instituto Tecnológico de Orizaba, Av. Oriente 9 No. 852 Col. Emiliano Zapata, Orizaba, Veracruz C.P. 94320, Mexico
Mathematics, 2021, vol. 9, issue 20, 1-25
Abstract:
Cardiovascular diseases (CVDs) are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Because effective and accurate diagnosis is essential for CVD prevention and treatment, machine learning (ML) techniques can be used effectively and reliably to distinguish patients suffering from a CVD from those without any heart condition. Namely, machine learning algorithms (MLAs) play a key role in CVD diagnosis through predictive models that identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes (features) with respect to five performance metrics (accuracy, precision, recall, F1-score, and ROC-AUC), using the train-test split technique and k-fold cross-validation. Our study identifies the top-two and top-four attributes of the CVD datasets and uses the accuracy metric to determine that they are the best attributes for predicting and diagnosing CVD. As our main findings, the ten ML classifiers exhibited appropriate classification and predictive performance, measured by accuracy, with the top-two attributes, and we identified three main attributes for the diagnosis and prediction of CVDs such as arrhythmia and tachycardia. Hence, these classifiers can be successfully implemented to improve current CVD diagnosis efforts and help patients around the world, especially in regions where medical staff are lacking.
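The five metrics and the k-fold procedure named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`classification_metrics`, `roc_auc`, `k_fold_indices`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and only the Python standard library is used.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1-score, and ROC-AUC.

    The threshold (hypothetical default 0.5) turns scores into 0/1 predictions.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": roc_auc(y_true, y_score),
    }


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # Toy data (invented): 1 = CVD present, 0 = no CVD.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.8, 0.3]
    print(classification_metrics(labels, scores))
    for train_idx, test_idx in k_fold_indices(len(labels), k=4):
        print(train_idx, test_idx)
```

In practice each fold's training indices would be used to fit a classifier on the selected top-two or top-four attributes, and the metrics would be averaged over the held-out folds; a real pipeline would normally shuffle or stratify the folds rather than use contiguous blocks.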
Keywords: big data; health prevention; machine learning; medical data
JEL-codes: C
Date: 2021
Downloads:
https://www.mdpi.com/2227-7390/9/20/2537/pdf (application/pdf)
https://www.mdpi.com/2227-7390/9/20/2537/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:9:y:2021:i:20:p:2537-:d:652487
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.