A comparative analysis of machine learning algorithms for predicting probabilities of default
Adrian Iulian Cristescu and
Matteo Giordano
Papers from arXiv.org
Abstract:
Predicting the probability of default (PD) of prospective loans is a critical objective for financial institutions. In recent years, machine learning (ML) algorithms have achieved remarkable success across a wide variety of prediction tasks; yet, they remain relatively underutilised in credit risk analysis. This paper highlights the opportunities that ML algorithms offer to this field by comparing the performance of five predictive models-Random Forests, Decision Trees, XGBoost, Gradient Boosting and AdaBoost-to the predominantly used logistic regression, over a benchmark dataset from Scheule et al. (Credit Risk Analytics: The R Companion). Our findings underscore the strengths and weaknesses of each method, providing valuable insights into the most effective ML algorithms for PD prediction in the context of loan portfolios.
Date: 2025-06
References: Add references at CitEc
Citations:
Downloads: (external link)
http://arxiv.org/pdf/2506.19789 Latest version (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2506.19789
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().