Random Generalized Additive Logistic Forest: A Novel Ensemble Method for Robust Binary Classification
Oyebayo Ridwan Olaniran,
Ali Rashash R. Alzahrani,
Nada MohammedSaeed Alharbi and
Asma Ahmad Alzahrani
Additional contact information
Oyebayo Ridwan Olaniran: Department of Statistics, Faculty of Physical Sciences, University of Ilorin, Ilorin 1515, Nigeria
Ali Rashash R. Alzahrani: Mathematics Department, Faculty of Sciences, Umm Al-Qura University, Makkah 24382, Saudi Arabia
Nada MohammedSaeed Alharbi: Department of Mathematics, Faculty of Science, Taibah University, Al-Madinah Al-Munawara 42353, Saudi Arabia
Asma Ahmad Alzahrani: Department of Mathematics, Faculty of Science, Al-Baha University, Alaqiq, Al-Baha 65799, Saudi Arabia
Mathematics, 2025, vol. 13, issue 7, 1-25
Abstract:
Ensemble methods have proven highly effective in enhancing predictive performance by combining multiple models. We introduce a novel ensemble approach, the Random Generalized Additive Logistic Forest (RGALF), which integrates generalized additive models (GAMs) within a random forest framework to improve binary classification tasks. Unlike traditional random forests, which rely on piecewise constant predictions in terminal nodes, RGALF fits GAM logistic regression (LR) models to the data in each terminal node, enabling it to capture complex nonlinear relationships and interactions among predictors. By aggregating these node-specific GAMs, RGALF addresses multicollinearity, enhances interpretability, and achieves superior bias–variance tradeoffs, particularly in nonlinear settings. Theoretical analysis confirms that RGALF achieves Stone’s optimal rates for additive models (O(n^{-2k/(2k+d)})) under appropriate conditions, outperforming the slower convergence of traditional random forests (O(n^{-2/3})). Furthermore, empirical results demonstrate RGALF’s effectiveness across both simulated and real-world datasets. In simulations, RGALF demonstrates superior performance over random forests (RFs), reducing variance by up to 69% and bias by 19% in nonlinear settings, with significant MSE improvements (0.032 vs. RF’s 0.054 at n = 1000), while achieving optimal convergence rates (O(n^{-0.48}) vs. RF’s O(n^{-0.29})). On real-world medical datasets, RGALF attains near-perfect accuracy and AUC: 100% accuracy/AUC for Heart Failure and Hepatitis C (HCV) prediction, 99% accuracy/100% AUC for Pima Diabetes, and 98.8% accuracy/100% AUC for Indian Liver Patient (ILPD), outperforming state-of-the-art methods. Notably, RGALF captures complex biomarker interactions (BMI–insulin in diabetes) missed by traditional models.
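The core idea in the abstract (trees whose terminal nodes hold additive logistic models, with predictions averaged over the forest) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, for brevity, a single median split per tree, a quadratic polynomial basis per feature in place of smoothing splines, and plain gradient descent in place of a penalized GAM fit. All function names (`fit_leaf`, `fit_tree`, `predict_proba`, `make_data`) and the synthetic circle dataset are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def basis(x):
    # Additive basis: intercept plus x_j and x_j^2 per feature (no cross terms).
    feats = [1.0]
    for v in x:
        feats.extend((v, v * v))
    return feats


def fit_leaf(X, y, steps=400, lr=0.3):
    # Gradient descent on the mean log-loss; a stand-in for a penalized GAM fit.
    B = [basis(x) for x in X]
    w = [0.0] * len(B[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for b, t in zip(B, y):
            err = sigmoid(sum(wi * bi for wi, bi in zip(w, b))) - t
            for k, bk in enumerate(b):
                grad[k] += err * bk
        w = [wi - lr * g / len(B) for wi, g in zip(w, grad)]
    return w


def fit_tree(X, y, rng):
    # Bootstrap resample, then one median split on a randomly chosen feature.
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    Xb = [X[i] for i in idx]
    yb = [y[i] for i in idx]
    j = rng.randrange(len(X[0]))
    thr = sorted(x[j] for x in Xb)[n // 2]
    leaves = []
    for keep in (lambda x: x[j] <= thr, lambda x: x[j] > thr):
        # Fall back to the whole bootstrap sample if a side is empty.
        rows = [(x, t) for x, t in zip(Xb, yb) if keep(x)] or list(zip(Xb, yb))
        leaves.append(fit_leaf([r[0] for r in rows], [r[1] for r in rows]))
    return j, thr, leaves


def predict_proba(forest, x):
    # Route x to its leaf in every tree and average the leaf-model probabilities.
    probs = []
    for j, thr, (left, right) in forest:
        w = left if x[j] <= thr else right
        probs.append(sigmoid(sum(wi * bi for wi, bi in zip(w, basis(x)))))
    return sum(probs) / len(probs)


def make_data(n, rng):
    # Synthetic nonlinear problem: class 1 inside the unit circle.
    X = [[rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5)] for _ in range(n)]
    y = [1 if x[0] ** 2 + x[1] ** 2 < 1.0 else 0 for x in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(200, rng)
forest = [fit_tree(X_train, y_train, rng) for _ in range(5)]
acc = sum(
    (predict_proba(forest, x) > 0.5) == bool(t) for x, t in zip(X_test, y_test)
) / len(y_test)
print(f"test accuracy: {acc:.3f}")
```

Because each leaf returns a smooth probability rather than a constant, the averaged prediction varies continuously within a node, which is the contrast with piecewise constant random forests that the abstract draws.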
Keywords: generalized additive model (GAM); random forest (RF); logistic regression (LR); ensemble methods; binary classification; nonlinearity
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/7/1214/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/7/1214/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:7:p:1214-:d:1629768
Mathematics is currently edited by Ms. Emma He