A Comparative Study on Loan Default Classification with Imbalanced Data Processing
Ziang Wang
Additional contact information
Ziang Wang: McMaster University, Department of Mathematics & Statistics
A chapter in Proceedings of the 2025 International Conference on Hybrid Commerce, Human Capital, and Economic Dynamics (ICHCH 2025), 2026, pp 280-287 from Springer
Abstract:
Credit risk default classification is a cornerstone of modern financial risk management: it enables institutions to optimize lending, allocate capital efficiently, and mitigate losses. Accurate predictions directly affect financial system stability amid economic volatility. A critical challenge is data imbalance. Default samples typically make up only 5–15% of datasets, which biases models toward the majority class and harms recall, the key metric for minimizing losses. This study compares four models (Logistic Regression, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Random Forest) combined with four imbalance-handling methods, using Accuracy, Recall, and F1 score as metrics. The results show that tree-based models outperform Logistic Regression across all three metrics. For Logistic Regression, class weighting effectively improves recall. For tree-based models, class weighting boosts recall but slightly reduces F1, while Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN) improve F1 but risk introducing noise. These findings indicate which imbalance-handling strategy suits each model family; future work is needed on ensemble methods and interpretability to refine credit risk assessment.
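As a rough illustration of why the abstract treats recall, rather than accuracy, as the key metric under imbalance, the sketch below computes the three reported metrics on toy labels with a 10% default rate. It is not the chapter's code or data: the label vectors, the function names classification_metrics and balanced_class_weights, and the inverse-frequency weighting formula are all assumptions made for illustration, and the chapter does not state which weighting scheme it uses.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall, and F1 for the positive (default) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this common heuristic stands in for the chapter's
    unspecified class-weighting scheme.
    """
    counts = Counter(y)
    n_samples = len(y)
    return {c: n_samples / (len(counts) * k) for c, k in counts.items()}


# Hypothetical toy labels: 10 defaults (1) among 100 loans.
y_true = [1] * 10 + [0] * 90

# A degenerate model that always predicts the majority class.
always_majority = [0] * 100

# A hypothetical recall-oriented model: catches 8 of 10 defaults,
# at the cost of 12 false alarms among the 90 non-defaults.
recall_oriented = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 78

baseline = classification_metrics(y_true, always_majority)
tuned = classification_metrics(y_true, recall_oriented)
weights = balanced_class_weights(y_true)

print("baseline:", baseline)
print("recall-oriented:", tuned)
print("class weights:", weights)
```

On these toy labels the always-majority model has the higher accuracy yet identifies no defaults, while the recall-oriented model gives up a little accuracy to recover most of them. The inverse-frequency weights show how class weighting makes each minority-class error count more heavily during training.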
Keywords: Weight processing; Logistic Regression; Tree-based Models
Date: 2026
Persistent link: https://EconPapers.repec.org/RePEc:spr:advbcp:978-2-38476-585-0_33
Ordering information: This item can be ordered from
http://www.springer.com/9782384765850
DOI: 10.2991/978-2-38476-585-0_33
Series: Advances in Economics, Business and Management Research, Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.