Empirical Analysis of Machine Learning and Stacking Ensemble Methods for Heart Disease Detection

Sadhukhan, Bikash; Gupta, Pratick; Narayan, Atulya; Mourya, Akshay Kumar; Kumar, Shivam

Additional contact information
Bikash Sadhukhan: Department of Computer Science and Engineering, Techno International New Town, Kolkata 700156, India
Pratick Gupta: Department of Computer Science and Engineering, Techno International New Town, Kolkata 700156, India
Atulya Narayan: Department of Computer Science and Engineering, Techno International New Town, Kolkata 700156, India
Akshay Kumar Mourya: Department of Computer Science and Engineering, Techno International New Town, Kolkata 700156, India
Shivam Kumar: Department of Computer Science and Engineering, Techno International New Town, Kolkata 700156, India

Journal of Information & Knowledge Management (JIKM), 2025, vol. 24, issue 04, 1-26

Abstract: Cardiovascular diseases are prominent contributors to mortality worldwide, and timely identification is crucial for enhancing patient prognosis. The protracted and exhaustive diagnostic procedures that result in delayed diagnosis can culminate in precarious circumstances that are difficult or impossible to manage. The utilisation of machine learning (ML) methodologies has the potential to facilitate the timely prediction of heart disease based on specific medical reports, thereby affording individuals the convenience of conducting such assessments from the comfort of their own homes. Using a dataset consisting of medical records and clinical attributes, ten models were evaluated, including decision tree, K-nearest neighbours, gradient boosting, random forest, AdaBoost, support vector machine, logistic regression, naive Bayes, a hypertuned gradient boosting model, and a StackingCV ensemble model. Utilising performance metrics such as accuracy, precision, recall, F1-scores, and the ROC-AUC, their predictive capabilities were evaluated. The random forest classifier achieved an accuracy of 0.94, demonstrating its high discriminatory power in identifying cases of cardiovascular disease. With an accuracy of 0.91, the K-nearest neighbours model demonstrated its potential for accurate classification. Intriguingly, the hypertuned gradient boosting model significantly outperformed the baseline model, achieving an impressive accuracy of 0.96. Additionally, the StackingCV ensemble model demonstrated superior accuracy, recall, F1-scores, and an ROC-AUC of 0.99, surpassing all the individual classifiers. These results demonstrate the effectiveness of ML algorithms in the detection of heart disease. The random forest classifier, the hypertuned gradient boosting model, and the StackingCV ensemble models demonstrate high accuracy and show promise for implementation in clinical settings.
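
Illustrative sketch (not taken from the article): the kind of cross-validated stacking pipeline and metric set named in the abstract could look roughly like the following. Everything here is an assumption made for illustration: scikit-learn's StackingClassifier stands in for the "StackingCV" model (this listing does not say which library the authors used), the data are synthetic rather than the study's medical records, and the choice of base learners, meta-learner, and the helper name evaluate_stacking are hypothetical. The numbers it prints say nothing about the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate_stacking(random_state=0):
    """Fit a stacking ensemble on synthetic data and return test metrics."""
    # Assumption: synthetic binary data; NOT the dataset used in the article.
    X, y = make_classification(
        n_samples=600, n_features=13, n_informative=8, random_state=random_state
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Assumption: three of the classifiers named in the abstract as base learners.
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=random_state)),
        ("gb", GradientBoostingClassifier(random_state=random_state)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ]

    # The meta-learner is trained on out-of-fold predictions (cv=5) of the
    # base learners, which is the core idea of cross-validated stacking.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    stack.fit(X_train, y_train)

    y_pred = stack.predict(X_test)
    y_score = stack.predict_proba(X_test)[:, 1]  # probability of the positive class

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }


if __name__ == "__main__":
    for name, value in evaluate_stacking().items():
        print(f"{name}: {value:.3f}")
```

Training the meta-learner on out-of-fold predictions rather than on in-sample predictions is what keeps the ensemble from simply memorising the base learners' training-set outputs; the held-out test split is then used only for the five reported metrics.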

Keywords: Heart disease; StackingCV; hypertuned gradient boosting; ensemble learning
Date: 2025

Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0219649225500285
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:24:y:2025:i:04:n:s0219649225500285

DOI: 10.1142/S0219649225500285

Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh

More articles in Journal of Information & Knowledge Management (JIKM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.