A Two-Stage Machine Learning Approach to Bankruptcy Prediction: Integrating Full-Feature Modeling and Optimized Feature Selection
Masanobu Matsumaru and
Hideki Katagiri
Additional contact information
Masanobu Matsumaru: Research Institute for Engineering, Kanagawa University, Yokohama 221-8686, Japan
Hideki Katagiri: Department of Industrial Engineering and Management, Kanagawa University, Yokohama 221-8686, Japan
JRFM, 2025, vol. 18, issue 12, 1-21
Abstract:
Corporate bankruptcy prediction has become increasingly critical amid economic uncertainty. This study proposes a novel two-stage machine learning approach to improve bankruptcy prediction accuracy and applies it to companies listed on the Tokyo Stock Exchange. In the first stage, models were trained on all 173 financial indicators. In the second stage, a wrapper-based feature selection process reduced dimensionality and eliminated noise, identifying an optimal seven-feature set. Two ensemble learning methods were used: Random Forest and Light Gradient Boosting Machine (LightGBM), a gradient boosting method that employs a leaf-wise tree growth strategy, enabling fast computation and high predictive accuracy, especially on large-scale, high-dimensional datasets. With the reduced feature set, Random Forest correctly predicted 566 bankruptcies (88 more than with all features), compared with 451 for LightGBM (31 more than with all features). The study also addresses the challenges posed by imbalanced data by employing resampling techniques (SMOTE, SMOTE-ENN, and KMeans). In addition, it recognizes the need for industry-specific modeling by constructing models for six industry sectors. These findings highlight the importance of feature selection and ensemble learning for improving model generalizability and uncovering industry-specific patterns. The study contributes to the field of bankruptcy prediction by providing a robust framework for accurate and interpretable predictions, for both academic research and practical applications. Future work will focus on further improving prediction accuracy to identify more potential bankruptcies.
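The wrapper-based selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the search strategy, the scoring criterion, or the data split, so greedy forward selection, balanced accuracy, and a single hold-out set are assumptions made here. A simple nearest-centroid classifier stands in for Random Forest or LightGBM so that the example needs only the standard library, and the data are synthetic. All function names are hypothetical.

```python
import random

def make_data(n_neg, n_pos, n_features, informative, shift, rng):
    """Synthetic imbalanced data (assumption): the rare class 1 is shifted on `informative` features."""
    X, y = [], []
    for label, count in ((0, n_neg), (1, n_pos)):
        for _ in range(count):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for f in informative:
                    row[f] += shift
            X.append(row)
            y.append(label)
    return X, y

def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Stand-in classifier: assign each test row to the closer class centroid on `features`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(train_X, train_y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    preds = []
    for x in test_X:
        dist = {
            label: sum((x[f] - c) ** 2 for f, c in zip(features, cen))
            for label, cen in centroids.items()
        }
        preds.append(min(dist, key=dist.get))
    return preds

def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on class 1 and the recall on class 0."""
    recalls = []
    for label in (0, 1):
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        total = sum(1 for t in y_true if t == label)
        recalls.append(hits / total)
    return sum(recalls) / 2

def forward_select(train, valid, n_features, max_features):
    """Greedy wrapper: repeatedly add the feature that most improves the validation score."""
    (train_X, train_y), (valid_X, valid_y) = train, valid
    selected, best_score = [], 0.0
    while len(selected) < max_features:
        scored = []
        for f in range(n_features):
            if f in selected:
                continue
            preds = nearest_centroid_predict(train_X, train_y, valid_X, selected + [f])
            scored.append((balanced_accuracy(valid_y, preds), f))
        score, feature = max(scored)
        if score <= best_score:  # stop when no candidate improves the score
            break
        selected.append(feature)
        best_score = score
    return selected, best_score

rng = random.Random(0)
train = make_data(900, 100, 6, informative=(0, 1), shift=4.0, rng=rng)
valid = make_data(450, 50, 6, informative=(0, 1), shift=4.0, rng=rng)
selected, score = forward_select(train, valid, n_features=6, max_features=2)
print("selected features:", selected, "balanced accuracy:", round(score, 3))
```

In this sketch the model is retrained for every candidate subset, which is what distinguishes a wrapper method from a filter that ranks features without fitting the model. Balanced accuracy is used because plain accuracy is dominated by the majority class when the positive class is rare.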
Keywords: corporate bankruptcy prediction; feature selection; ensemble learning; Random Forest; LightGBM; imbalanced data; Tokyo Stock Exchange
JEL-codes: C E F2 F3 G
Date: 2025
Downloads:
https://www.mdpi.com/1911-8074/18/12/662/pdf (application/pdf)
https://www.mdpi.com/1911-8074/18/12/662/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jjrfmx:v:18:y:2025:i:12:p:662-:d:1800867
JRFM is currently edited by Ms. Chelthy Cheng
Bibliographic data for series maintained by MDPI Indexing Manager.