Optimizing Credit Risk Prediction for Peer-to-Peer Lending Using Machine Learning

Souadda, Lyne Imene; Halitim, Ahmed Rami; Benilles, Billel; Oliveira, José Manuel; Ramos, Patrícia

Additional contact information
Lyne Imene Souadda: Applied Studies in Business and Management Sciences Laboratory, Finance Department, Higher School of Commerce, Kolea University Center, Kolea 42003, Tipaza, Algeria
Ahmed Rami Halitim: Statistics Department, National School of Statistics and Applied Economics, Kolea University Center, Kolea 42003, Tipaza, Algeria
Billel Benilles: Applied Studies in Business and Management Sciences Laboratory, Finance Department, Higher School of Commerce, Kolea University Center, Kolea 42003, Tipaza, Algeria
José Manuel Oliveira: Institute for Systems and Computer Engineering, Technology and Science, Campus da FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
Patrícia Ramos: Institute for Systems and Computer Engineering, Technology and Science, Campus da FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Forecasting, 2025, vol. 7, issue 3, 1-31

Abstract: Hyperparameter optimization (HPO) is critical for enhancing the predictive performance of machine learning models in credit risk assessment for peer-to-peer (P2P) lending. This study evaluates four HPO methods (Grid Search, Random Search, Hyperopt, and Optuna) across four models (Logistic Regression, Random Forest, XGBoost, and LightGBM) using three real-world datasets (Lending Club, Australia, Taiwan). We assess predictive accuracy (AUC, Sensitivity, Specificity, G-Mean), computational efficiency, robustness, and interpretability. LightGBM achieves the highest AUC (e.g., 70.77% on Lending Club, 93.25% on Australia, 77.85% on Taiwan), with XGBoost performing comparably. Bayesian methods (Hyperopt, Optuna) match or approach Grid Search’s accuracy while reducing runtime by up to 75.7-fold (e.g., 3.19 vs. 241.47 min for LightGBM on Lending Club). A sensitivity analysis confirms robust hyperparameter configurations, with AUC variations typically below 0.4% under ±10% perturbations. A feature importance analysis, using gain and SHAP metrics, identifies debt-to-income ratio and employment title as key default predictors, with stable rankings (Spearman correlation > 0.95, p < 0.01) across tuning methods, enhancing model interpretability. Operational impact depends on data quality, scalable infrastructure, fairness audits for features like employment title, and stakeholder collaboration to ensure compliance with regulations like the EU AI Act and U.S. Equal Credit Opportunity Act. These findings advocate Bayesian HPO and ensemble models in P2P lending, offering scalable, transparent, and fair solutions for default prediction, with future research suggested to explore advanced resampling, cost-sensitive metrics, and feature interactions.
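
Illustrative note: the abstract reports Sensitivity, Specificity, and G-Mean alongside AUC, and quotes a runtime reduction of up to 75.7-fold. The following is a minimal Python sketch of how such quantities are conventionally computed, assuming the standard definitions (G-Mean as the geometric mean of Sensitivity and Specificity, with default coded as class 1). The function names and the toy labels are hypothetical and are not taken from the paper; only the two runtimes come from the abstract.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = default, 0 = non-default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity_specificity_gmean(y_true, y_pred):
    """Return (sensitivity, specificity, G-Mean) under the standard definitions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return sensitivity, specificity, math.sqrt(sensitivity * specificity)


def speedup(slow_minutes, fast_minutes):
    """Fold reduction in runtime: slower runtime divided by faster runtime."""
    return slow_minutes / fast_minutes


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec, gmean = sensitivity_specificity_gmean(y_true, y_pred)

# Runtimes quoted in the abstract for LightGBM on Lending Club (minutes).
fold = speedup(241.47, 3.19)
```

G-Mean is commonly preferred over plain accuracy on imbalanced default data because it drops toward zero whenever either class is poorly recognized. The ratio of the two quoted runtimes is consistent with the 75.7-fold figure in the abstract.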

Keywords: credit risk; ensemble learning; hyperparameter optimization; peer-to-peer lending
JEL-codes: A1 B4 C0 C1 C2 C3 C4 C5 C8 M0 Q2 Q3 Q4
Date: 2025

Downloads:
https://www.mdpi.com/2571-9394/7/3/35/pdf (application/pdf)
https://www.mdpi.com/2571-9394/7/3/35/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jforec:v:7:y:2025:i:3:p:35-:d:1690341

Forecasting is currently edited by Ms. Joss Chen

Bibliographic data for the series are maintained by the MDPI Indexing Manager.