Optimizing the Collection Process in Credit Risk Management: A Comparison of Machine Learning Techniques for Predicting Payment Probability at Different Stages of Arrears

Andrés Carrera and Marco E. Benalcázar
Additional contact information
Andrés Carrera: Artificial Intelligence and Computer Vision Research Laboratory, Departamento de Informática y Ciencias de la Computación, Escuela Politécnica Nacional, Quito 170150, Ecuador
Marco E. Benalcázar: Artificial Intelligence and Computer Vision Research Laboratory, Departamento de Informática y Ciencias de la Computación, Escuela Politécnica Nacional, Quito 170150, Ecuador

JRFM, 2025, vol. 18, issue 11, 1-21

Abstract: In credit risk management, scoring models based on logistic regression have been developed to improve default risk assessment. However, these models require complex feature engineering, and their accuracy worsens as arrears progress. This study proposes the use of machine learning techniques (XGBoost and artificial neural networks) to generate scores in different arrears segments (No Arrears Segment, 1–30 Days of Arrears Segment, 31–90 Days of Arrears Segment, and All Segments). The Kolmogorov–Smirnov (KS) metric is used to assess the efficiency and predictive power of the models. To ensure the accuracy and reliability of the models, a five-step methodology is employed. It starts with the formulation of the problem, followed by the selection of a data sample and the definition of the target variable. A descriptive analysis of the data is then performed to facilitate data cleaning. Subsequently, the models are trained and tested. Finally, the results are analyzed and the resulting models are interpreted. The results show that both the XGBoost and artificial neural network models outperform logistic regression in most of the arrears segments. In the No Arrears Segment, the XGBoost model is the best with KS = 63.36%. In the 1–30 Segment, XGBoost is also the best with KS = 51.38%. In the 31–90 Segment, the artificial neural network model is the best with KS = 38.77%. Finally, across all arrears segments combined, the XGBoost model is again the best with KS = 74.05%.
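The abstract compares models by their KS values. As a rough illustration of that metric only, and not the authors' implementation, the sketch below computes the two-sample KS statistic as the largest gap between the empirical cumulative score distributions of the two outcome classes. The function name `ks_statistic`, the 0/1 label encoding, and the toy scores are assumptions made for this example and do not come from the paper.

```python
from typing import Sequence


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Two-sample KS statistic between the scores of class 1 and class 0.

    Hypothetical helper: labels equal to 1 form one class, everything
    else forms the other class.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    pairs = sorted(zip(scores, labels))
    cum_pos = cum_neg = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Consume every observation tied at this score before comparing
        # the two empirical CDFs, so ties do not create artificial gaps.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            i += 1
        ks = max(ks, abs(cum_pos / n_pos - cum_neg / n_neg))
    return ks


# Toy data invented for illustration (not taken from the study).
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
print(f"KS = {100 * ks_statistic(toy_scores, toy_labels):.2f}%")  # KS = 50.00%
```

The function returns a fraction between 0 and 1; multiplying by 100 gives a percentage in the same form as the values quoted in the abstract. A value near 0 means the scores barely separate the two classes, while a value near 1 means they separate them almost completely.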

Keywords: XGBoost; artificial neural networks; logistic regression; credit scoring; credit risk management
JEL-codes: C E F2 F3 G
Date: 2025

Downloads: (external link)
https://www.mdpi.com/1911-8074/18/11/630/pdf (application/pdf)
https://www.mdpi.com/1911-8074/18/11/630/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jjrfmx:v:18:y:2025:i:11:p:630-:d:1791039

JRFM is currently edited by Ms. Chelthy Cheng

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-11-11
Handle: RePEc:gam:jjrfmx:v:18:y:2025:i:11:p:630-:d:1791039