The impact of class imbalance in logistic regression models for low-default portfolios in credit risk

Willem D. Schutte, Charl Pretorius, Neill Smit, Leandra van der Merwe and Robert Maxwell

Papers from arXiv.org

Abstract: In this paper, we study how class imbalance, typical of low-default credit portfolios, affects the performance of logistic regression models. Using a simulation study with controlled data-generating mechanisms, we vary (i) the level of class imbalance and (ii) the strength of association between the predictors and the response. The results show that, for a given strength of association, achievable classification accuracy deteriorates markedly as the event rate decreases, and the optimal classification cut-off shifts with the level of imbalance. In contrast, the Gini coefficient is comparatively stable with respect to class imbalance once sample sizes are sufficiently large, even when classification accuracy is strongly affected. As a practical guideline, we summarise attainable classification performance as a function of the event rate and strength of association between the predictors and the response.
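The kind of simulation the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the authors' actual design: it assumes a single standard-normal predictor, uses the intercept of the data-generating logistic model as a proxy for the event rate and the slope as the strength of association, and fits the model with a hand-written Newton-Raphson routine. All function names and parameter values are hypothetical. For each scenario it reports the realised event rate, the Gini coefficient (computed as 2·AUC − 1), and the best classification accuracy over all cut-offs, which are the quantities the abstract compares.

```python
import math
import random


def simulate(n, intercept, slope, rng):
    """Draw (x, y) pairs from a one-predictor logistic model (assumed design)."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
        xs.append(x)
        ys.append(1 if rng.random() < p else 0)
    return xs, ys


def fit_logistic(xs, ys, iters=25):
    """Estimate intercept and slope by Newton-Raphson on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 system
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def gini(scores, ys):
    """Gini = 2 * AUC - 1, with AUC from the rank-sum formula (ties assumed absent)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n1 = sum(ys)
    n0 = len(ys) - n1
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if ys[i] == 1)
    auc = (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)
    return 2.0 * auc - 1.0


def best_cutoff(scores, ys):
    """Return (accuracy, cut-off) maximising accuracy; predict event if score >= cut-off."""
    n = len(ys)
    n0 = n - sum(ys)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    best_acc, best_cut = n0 / n, 1.0  # baseline: classify everything as non-event
    tp = 0
    for k, i in enumerate(order, start=1):
        tp += ys[i]                   # true positives among the top-k scores
        acc = (tp + n0 - (k - tp)) / n
        if acc > best_acc:
            best_acc, best_cut = acc, scores[i]
    return best_acc, best_cut


def run_scenario(n, intercept, slope, seed):
    """Simulate, fit, and summarise one (event rate, association) scenario."""
    rng = random.Random(seed)
    xs, ys = simulate(n, intercept, slope, rng)
    b0, b1 = fit_logistic(xs, ys)
    scores = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
    acc, cut = best_cutoff(scores, ys)
    return sum(ys) / n, gini(scores, ys), acc, cut


if __name__ == "__main__":
    # Hypothetical grid: lower intercepts give rarer events at a fixed slope.
    for intercept in (-1.0, -3.0, -5.0):
        rate, g, acc, cut = run_scenario(5000, intercept, 1.5, seed=1)
        print(f"event rate={rate:.3f}  Gini={g:.3f}  best acc={acc:.3f}  cut-off={cut:.3f}")
```

Because the cut-off search includes the option of classifying every observation as a non-event, the best accuracy can never fall below one minus the event rate. This is why raw accuracy is hard to interpret on low-default data, while the rank-based Gini coefficient does not depend on the choice of cut-off.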

Date: 2026-02

Downloads: (external link)
http://arxiv.org/pdf/2602.19663 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2602.19663

 
Page updated 2026-02-24
Handle: RePEc:arx:papers:2602.19663