Resampling Techniques Study on Class Imbalance Problem in Credit Risk Prediction
Zixue Zhao,
Tianxiang Cui,
Shusheng Ding,
Jiawei Li and
Anthony Graham Bellotti
Additional contact information
Zixue Zhao: School of Statistics and Mathematics, Yunnan University of Finance and Economics, No. 237, LongQuan Rd., Kunming 650221, China
Tianxiang Cui: School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
Shusheng Ding: School of Business, Ningbo University, 818 Fenghua Road Ningbo, Ningbo 315211, China
Jiawei Li: School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
Anthony Graham Bellotti: School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
Mathematics, 2024, vol. 12, issue 5, 1-27
Abstract:
Credit risk prediction relies heavily on historical data provided by financial institutions. The goal is to identify commonalities among defaulting users based on existing information. However, data on defaulters is often limited, so credit datasets typically contain significantly fewer positive samples (defaults) than negative samples (nondefaults). This poses a serious challenge known as the class imbalance problem, which can substantially degrade data quality and the effectiveness of predictive models. To address the problem, various resampling techniques have been proposed and studied extensively. Despite this ongoing research, there is no consensus on the most effective technique. The choice of resampling technique is closely related to the dataset size and imbalance ratio, and its effectiveness varies across classifiers. Moreover, there is a notable gap in research on suitable techniques for extremely imbalanced datasets. This study therefore compares popular resampling techniques across different datasets and classifiers, and proposes a novel hybrid sampling method tailored to extremely imbalanced datasets. Our experimental results demonstrate that the new technique significantly enhances classifier predictive performance, shedding light on effective strategies for managing the class imbalance problem in credit risk prediction.
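To make the terms in the abstract concrete, the following is a minimal sketch of the two simplest resampling baselines, random oversampling and random undersampling, together with the imbalance ratio. It is not the hybrid method proposed in the paper, which the abstract does not specify. The function names, the toy data (5 defaults among 100 samples), and the convention that label 1 marks a default are all assumptions made for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(features, labels, minority_label=1, seed=0):
    """Duplicate random minority samples until both classes are the same size.

    Assumes binary labels with at least one sample in each class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Draw with replacement from the minority class to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = majority + minority + extra
    rng.shuffle(idx)
    return [features[i] for i in idx], [labels[i] for i in idx]


def random_undersample(features, labels, minority_label=1, seed=0):
    """Drop random majority samples until both classes are the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Draw without replacement from the majority class.
    kept = rng.sample(majority, min(len(minority), len(majority)))
    idx = kept + minority
    rng.shuffle(idx)
    return [features[i] for i in idx], [labels[i] for i in idx]


# Hypothetical toy data: 5 defaults (label 1) and 95 nondefaults (label 0).
features = [[i] for i in range(100)]
labels = [1] * 5 + [0] * 95

print(imbalance_ratio(labels))  # 19.0

X_over, y_over = random_oversample(features, labels)
print(len(y_over), imbalance_ratio(y_over))  # 190 1.0

X_under, y_under = random_undersample(features, labels)
print(len(y_under), imbalance_ratio(y_under))  # 10 1.0
```

Oversampling keeps every original record but repeats minority rows, while undersampling discards majority rows; which trade-off is preferable depends on dataset size, imbalance ratio, and classifier, which is the comparison the abstract describes.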
Keywords: credit risk prediction; resampling; class imbalance
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/5/701/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/5/701/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:5:p:701-:d:1347551
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.