A method of classifying imbalanced credit data based on the AC-CTGAN hybrid sampling algorithm
Tinggui Chen, Hailian Gu, Zhiyu Yang, Jianjun Yang and Bing Wang
Journal of Credit Risk
Abstract:
The rapid growth of consumer credit services has heightened financial institutions’ need for stronger risk management as they strive to satisfy individuals’ varied consumption preferences. Identifying personal credit risk is crucial in financial risk management, so financial institutions need a systematic and effective credit risk identification framework to reduce the likelihood of credit defaults. To address the class imbalance of credit data, this paper works at the data level and proposes an adaptive cluster mixed sampling method based on conditional tabular generative adversarial networks (AC-CTGAN). The method first applies the edited nearest neighbors (ENN) algorithm for preliminary denoising of the original credit data, then uses an improved K-means algorithm to partition the minority samples into multiple subclusters. The local density of each subcluster is calculated, and the oversampling weight of each subcluster is determined adaptively from that density. Finally, minority samples are generated via the CTGAN, and the decision boundary is clarified via the Tomek links algorithm. Comparative experimental results show that the minority class samples generated by the AC-CTGAN algorithm realistically reflect the distribution of the original data, minimize class overlap, limit the introduction of new noisy data and increase sample diversity. The potential within-class imbalance of credit data is also somewhat alleviated. Risk-identification models trained on credit data processed by the AC-CTGAN algorithm generalize better than models trained on data processed by the synthetic minority oversampling technique (SMOTE), SMOTE variants or the original CTGAN.
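The adaptive weighting step can be illustrated with a minimal sketch. The abstract does not state how local density is defined or how it maps to an oversampling weight, so everything below is an assumption made for illustration: density is approximated through the mean pairwise Euclidean distance within a subcluster, sparser subclusters receive proportionally larger weights, and the function names (mean_pairwise_distance, oversampling_weights, allocate_samples) are hypothetical. The sketch covers only the allocation of synthetic samples across subclusters, not the ENN, K-means, CTGAN or Tomek links stages.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points):
    """Sparsity proxy for one subcluster (assumed, not the paper's definition).

    A larger mean pairwise distance is read as a lower local density.
    """
    if len(points) < 2:
        raise ValueError("a subcluster needs at least two points")
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return sum(dists) / len(dists)


def oversampling_weights(subclusters):
    """Weight each subcluster in proportion to its sparsity (assumed rule)."""
    sparsity = [mean_pairwise_distance(c) for c in subclusters]
    total = sum(sparsity)
    if total == 0:
        # All points coincide in every subcluster: fall back to equal weights.
        return [1.0 / len(subclusters)] * len(subclusters)
    return [s / total for s in sparsity]


def allocate_samples(n_new, weights):
    """Split n_new synthetic samples across subclusters by largest remainder."""
    raw = [n_new * w for w in weights]
    counts = [math.floor(r) for r in raw]
    leftover = n_new - sum(counts)
    # Hand the remaining samples to the largest fractional parts first.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Toy minority subclusters: the second is three times as spread out as the first.
tight = [(0, 0), (3, 4)]
spread = [(0, 0), (0, 15)]
weights = oversampling_weights([tight, spread])
print(allocate_samples(8, weights))  # prints [2, 6]
```

Under these assumptions, each count would be the number of minority samples requested from a generator for that subcluster. Weighting toward sparse subclusters is one common way to counter within-class imbalance; the paper may use a different rule.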
Downloads: (external link)
https://www.risk.net/journal-of-credit-risk/796042 ... d-sampling-algorithm (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:rsk:journ1:7960421