Hybrid Random Feature Selection and Recurrent Neural Network for Diabetes Prediction

Olaniran, Oyebayo Ridwan; Sikiru, Aliu Omotayo; Allohibi, Jeza; Alharbi, Abdulmajeed Atiah; Alharbi, Nada MohammedSaeed

Oyebayo Ridwan Olaniran, Aliu Omotayo Sikiru, Jeza Allohibi, Abdulmajeed Atiah Alharbi and Nada MohammedSaeed Alharbi
Additional contact information
Oyebayo Ridwan Olaniran: Department of Statistics, Faculty of Physical Sciences, University of Ilorin, Ilorin 1515, Nigeria
Aliu Omotayo Sikiru: Department of Statistics, Faculty of Physical Sciences, University of Ilorin, Ilorin 1515, Nigeria
Jeza Allohibi: Department of Mathematics, Faculty of Science, Taibah University, Al-Madinah Al-Munawara 42353, Saudi Arabia
Abdulmajeed Atiah Alharbi: Department of Mathematics, Faculty of Science, Taibah University, Al-Madinah Al-Munawara 42353, Saudi Arabia
Nada MohammedSaeed Alharbi: Department of Mathematics, Faculty of Science, Taibah University, Al-Madinah Al-Munawara 42353, Saudi Arabia

Mathematics, 2025, vol. 13, issue 4, 1-25

Abstract: This paper proposes a novel two-stage ensemble framework that combines Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) networks with randomized feature selection to improve the accuracy and calibration of diabetes prediction. The method first trains multiple LSTM/BiLSTM base models on dynamically sampled feature subsets to promote diversity; a meta-learner then integrates their predictions into a final robust output. A systematic simulation study shows that the feature selection proportion critically affects generalization: mid-range values (0.5–0.8 for LSTM; 0.6–0.8 for BiLSTM) optimize performance, while values close to 1 induce overfitting. Evaluation on three real-life benchmark datasets (Pima Indian Diabetes, Diabetic Retinopathy Debrecen, and Early Stage Diabetes Risk Prediction) shows that the framework achieves state-of-the-art results, surpassing conventional models (random forest, support vector machine) and recent hybrid frameworks, with accuracy of up to 100%, AUC of 99.1–100%, and superior calibration (Brier score: 0.006–0.023). Notably, the BiLSTM variant consistently outperforms the unidirectional LSTM within the proposed framework, particularly in sensitivity (98.4% vs. 97.0% on the retinopathy data), highlighting its strength in capturing temporal dependencies.
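
The two-stage idea in the abstract (base models trained on random feature subsets, then a meta-learner over their predictions, scored with the Brier score) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain logistic regression stands in for the LSTM/BiLSTM base models so the example needs only NumPy, the half-and-half split between base and meta training data is an assumption, and all function names (`fit_ensemble`, `ensemble_proba`, `brier_score`, and so on) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Stand-in learner (NOT an LSTM): logistic regression by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        err = sigmoid(X @ w + b) - y          # gradient of the log-loss w.r.t. the logit
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b


def predict_proba(model, X):
    w, b = model
    return sigmoid(X @ w + b)


def base_predictions(subsets, models, X):
    """One column of predicted probabilities per base model."""
    return np.column_stack(
        [predict_proba(m, X[:, cols]) for cols, m in zip(subsets, models)]
    )


def fit_ensemble(X, y, n_models=10, proportion=0.7, seed=0):
    """Stage 1: base models on random feature subsets. Stage 2: meta-learner."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = max(1, int(round(proportion * d)))    # number of features per base model
    order = rng.permutation(n)
    # Assumption: half the rows train the base models, the rest train the meta-learner.
    base_idx, meta_idx = order[: n // 2], order[n // 2:]
    subsets, models = [], []
    for _ in range(n_models):
        cols = np.sort(rng.choice(d, size=k, replace=False))
        subsets.append(cols)
        models.append(fit_logistic(X[base_idx][:, cols], y[base_idx]))
    Z = base_predictions(subsets, models, X[meta_idx])
    meta = fit_logistic(Z, y[meta_idx])       # meta-learner over base probabilities
    return subsets, models, meta


def ensemble_proba(ensemble, X):
    subsets, models, meta = ensemble
    return predict_proba(meta, base_predictions(subsets, models, X))


def brier_score(y_true, p):
    """Mean squared difference between predicted probability and outcome."""
    return float(np.mean((p - y_true) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 8))             # synthetic data, for illustration only
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    ens = fit_ensemble(X, y, n_models=5, proportion=0.7)
    print("Brier score:", brier_score(y, ensemble_proba(ens, X)))
```

The `proportion` argument plays the role of the feature selection proportion discussed in the abstract; replacing `fit_logistic` with a recurrent model would bring the sketch closer to the described framework.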

Keywords: recurrent neural network; long short-term memory; diabetes prediction; ensemble learning
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/4/628/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/4/628/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:4:p:628-:d:1591456

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.