Random Generalized Additive Logistic Forest: A Novel Ensemble Method for Robust Binary Classification

Olaniran, Oyebayo Ridwan; Alzahrani, Ali Rashash R.; Alharbi, Nada MohammedSaeed; Alzahrani, Asma Ahmad

Oyebayo Ridwan Olaniran, Ali Rashash R. Alzahrani, Nada MohammedSaeed Alharbi and Asma Ahmad Alzahrani
Additional contact information
Oyebayo Ridwan Olaniran: Department of Statistics, Faculty of Physical Sciences, University of Ilorin, Ilorin 1515, Nigeria
Ali Rashash R. Alzahrani: Mathematics Department, Faculty of Sciences, Umm Al-Qura University, Makkah 24382, Saudi Arabia
Nada MohammedSaeed Alharbi: Department of Mathematics, Faculty of Science, Taibah University, Al-Madinah Al-Munawara 42353, Saudi Arabia
Asma Ahmad Alzahrani: Department of Mathematics, Faculty of Science, Al-Baha University, Alaqiq, Al-Baha 65799, Saudi Arabia

Mathematics, 2025, vol. 13, issue 7, 1-25

Abstract: Ensemble methods have proven highly effective in enhancing predictive performance by combining multiple models. We introduce a novel ensemble approach, the Random Generalized Additive Logistic Forest (RGALF), which integrates generalized additive models (GAMs) within a random forest framework to improve binary classification tasks. Unlike traditional random forests, which rely on piecewise constant predictions in terminal nodes, RGALF fits GAM logistic regression (LR) models to the data in each terminal node, enabling it to capture complex nonlinear relationships and interactions among predictors. By aggregating these node-specific GAMs, RGALF addresses multicollinearity, enhances interpretability, and achieves superior bias–variance tradeoffs, particularly in nonlinear settings. Theoretical analysis confirms that RGALF achieves Stone’s optimal rates for additive models, $O(n^{-2k/(2k+d)})$, under appropriate conditions, outperforming the slower convergence of traditional random forests, $O(n^{-2/3})$. Furthermore, empirical results demonstrate RGALF’s effectiveness across both simulated and real-world datasets. In simulations, RGALF demonstrates superior performance over random forests (RFs), reducing variance by up to 69% and bias by 19% in nonlinear settings, with significant MSE improvements (0.032 vs. RF’s 0.054 at $n = 1000$), while achieving optimal convergence rates ($O(n^{-0.48})$ vs. RF’s $O(n^{-0.29})$). On real-world medical datasets, RGALF attains near-perfect accuracy and AUC: 100% accuracy/AUC for Heart Failure and Hepatitis C (HCV) prediction, 99% accuracy/100% AUC for Pima Diabetes, and 98.8% accuracy/100% AUC for Indian Liver Patient (ILPD), outperforming state-of-the-art methods. Notably, RGALF captures complex biomarker interactions (BMI–insulin in diabetes) missed by traditional models.
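The rates and MSE figures quoted in the abstract can be turned into a few concrete numbers. The following is a minimal arithmetic sketch, not code from the paper: the helper names are hypothetical, the constants hidden in the big-O terms are ignored, and the values k = 2, d = 1 are chosen purely for illustration.

```python
def additive_rate_exponent(k: float, d: float) -> float:
    """Exponent 2k/(2k+d) of the rate O(n^{-2k/(2k+d)}) as stated in the abstract."""
    return 2.0 * k / (2.0 * k + d)


def shrink_factor(exponent: float, n_from: int, n_to: int) -> float:
    """Factor by which n^{-exponent} changes when n grows from n_from to n_to.

    Ignores the constant in the big-O, so this is only a rough scale.
    """
    return (n_to / n_from) ** (-exponent)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional improvement of `improved` over `baseline`."""
    return (baseline - improved) / baseline


# Reported simulation MSE at n = 1000: RF 0.054 vs. RGALF 0.032.
mse_reduction = relative_reduction(0.054, 0.032)

# Reported empirical rates: RGALF n^{-0.48}, RF n^{-0.29}.
# Compare how much each error scale shrinks for a tenfold increase in n.
rgalf_shrink = shrink_factor(0.48, 1_000, 10_000)
rf_shrink = shrink_factor(0.29, 1_000, 10_000)

# Illustrative (assumed) smoothness k and dimension d for the theoretical rate.
example_exponent = additive_rate_exponent(k=2, d=1)

print(f"Relative MSE reduction at n = 1000: {mse_reduction:.1%}")
print(f"Error scale after 10x more data, RGALF: {rgalf_shrink:.3f}")
print(f"Error scale after 10x more data, RF:    {rf_shrink:.3f}")
print(f"Rate exponent for k = 2, d = 1: {example_exponent:.2f}")
```

A smaller shrink factor means the error scale falls faster as the sample grows, which is the sense in which the reported exponent 0.48 is the faster rate than 0.29.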

Keywords: generalized additive model (GAM); random forest (RF); logistic regression (LR); ensemble methods; binary classification; nonlinearity
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/7/1214/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/7/1214/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:7:p:1214-:d:1629768

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.