Balanced Hoeffding Tree Forest (BHTF): A Novel Multi-Label Classification with Oversampling and Undersampling Techniques for Failure Mode Diagnosis in Predictive Maintenance

Ghasemkhani, Bita; Kut, Recep Alp; Birant, Derya; Yilmaz, Reyat

Bita Ghasemkhani, Recep Alp Kut, Derya Birant and Reyat Yilmaz
Additional contact information
Bita Ghasemkhani: Graduate School of Natural and Applied Sciences, Dokuz Eylul University, Izmir 35390, Turkey
Recep Alp Kut: Department of Computer Engineering, Dokuz Eylul University, Izmir 35390, Turkey
Derya Birant: Department of Computer Engineering, Dokuz Eylul University, Izmir 35390, Turkey
Reyat Yilmaz: Department of Electrical and Electronics Engineering, Dokuz Eylul University, Izmir 35390, Turkey

Mathematics, 2025, vol. 13, issue 18, 1-45

Abstract: Predictive maintenance (PdM) is essential for reducing equipment downtime and enhancing operational efficiency. However, PdM datasets frequently suffer from significant class imbalance and are often limited to single-label classification, which fails to reflect the complexity of real-world industrial systems, where multiple failure modes can occur simultaneously. As the main contribution, we propose the Balanced Hoeffding Tree Forest (BHTF), a novel multi-label classification framework that combines oversampling and undersampling strategies to mitigate data imbalance. BHTF uses the binary relevance method to decompose the multi-label problem into multiple binary tasks and employs an ensemble of Hoeffding Trees to ensure scalability and adaptability to streaming data. In particular, BHTF unifies three learning paradigms, namely multi-label learning (MLL), ensemble learning (EL), and incremental learning (IL), providing a comprehensive and scalable approach for predictive maintenance applications. A key element of the proposed method is a hybrid data preprocessing strategy that introduces a novel undersampling technique, Proximity-Driven Undersampling (PDU), and combines it with the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance in highly skewed datasets. Experimental results on the benchmark AI4I 2020 dataset show that BHTF achieved an average classification accuracy of 97.44%, outperforming state-of-the-art methods (88.94%), an improvement of 11% on average. These findings highlight the potential of BHTF as a robust artificial intelligence-based solution for complex fault detection in manufacturing predictive maintenance applications.
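The building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the toy rows are made up, and only the generic ideas are shown (binary relevance decomposition, SMOTE-style interpolation between minority neighbours, and the standard Hoeffding bound that Hoeffding Trees use to decide when enough samples have been seen to split). PDU is not specified in the abstract, so it is deliberately left out.

```python
import math
import random


def binary_relevance_split(label_rows, label_names):
    """Turn a multi-label target matrix into one binary target list per label."""
    return {name: [row[j] for row in label_rows]
            for j, name in enumerate(label_names)}


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, SMOTE-style (illustrative only).

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Assumes at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (t - b)
                               for b, t in zip(base, target)))
    return synthetic


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


# Toy usage: two failure-mode labels, three made-up samples.
labels = [[1, 0], [0, 1], [1, 1]]
per_label_targets = binary_relevance_split(labels, ["HDF", "PWF"])

# Made-up minority-class feature vectors for one binary task.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
extra_points = smote_like_oversample(minority_points, n_new=4)
```

In a full pipeline, each per-label binary task would be rebalanced separately and then passed to its own ensemble of incremental trees; the bound shrinks as the number of observed samples `n` grows, which is what lets a Hoeffding Tree commit to a split on streaming data.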

Keywords: machine learning; predictive maintenance; multi-label classification; ensemble learning; incremental learning; data imbalance; fault detection
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/18/3019/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/18/3019/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:18:p:3019-:d:1752365

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.