Loan Default Prediction and Feature Importance Analysis Based on the XGBoost Model
Ruoyu Qi
European Journal of Business, Economics & Management, 2025, vol. 1, issue 2, 141-149
Abstract:
Loan default prediction is a critical task in financial risk management. Traditional statistical models often struggle to handle large-scale, nonlinear, and high-dimensional financial data. In this study, we explore the application of the eXtreme Gradient Boosting (XGBoost) model to predicting loan defaults using a publicly available dataset from Kaggle. The paper simulates a complete analytical pipeline, including data preprocessing, model training, evaluation, and feature importance analysis. Simulated results demonstrate that XGBoost can achieve high predictive accuracy and a robust ability to distinguish between defaulters and non-defaulters. Furthermore, feature importance analysis reveals that variables such as revolving credit utilization, borrower age, and past due history play crucial roles in determining default risk. This research highlights the effectiveness and interpretability of XGBoost in financial decision-making scenarios.
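The pipeline named in the abstract (training, evaluation, feature importance) can be illustrated with a minimal sketch. This is not the paper's code and does not use the xgboost library: it is a simplified stand-in that boosts depth-1 trees with XGBoost-style second-order leaf weights and split gain, written with the Python standard library only. The data are synthetic, and the feature names, the default rule in `make_data`, and all hyperparameters are assumptions made for illustration. In practice one would more likely use `xgboost.XGBClassifier` on the real dataset.

```python
import math
import random

# Hypothetical feature names; "noise" is an irrelevant control column.
FEATURES = ["revolving_utilization", "age", "times_past_due", "noise"]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def make_data(n, rng):
    """Synthetic borrowers; the default rule below is invented for illustration."""
    X, y = [], []
    for _ in range(n):
        row = [rng.random(), rng.uniform(21, 75), rng.choice([0, 0, 0, 1, 2, 3]), rng.random()]
        logit = 3.0 * row[0] - 0.04 * (row[1] - 45) + 0.9 * row[2] - 4.0
        X.append(row)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y

def best_stump(X, g, h, lam):
    """Return (gain, feature, threshold, w_left, w_right) for the best single split."""
    G, H = sum(g), sum(h)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(len(X)), key=lambda i: X[i][j])
        GL = HL = 0.0
        for a, b in zip(order, order[1:]):
            GL += g[a]
            HL += h[a]
            if X[a][j] == X[b][j]:
                continue  # cannot split between equal values
            GR, HR = G - GL, H - HL
            # Second-order split gain with L2 penalty lam on leaf weights.
            gain = 0.5 * (GL * GL / (HL + lam) + GR * GR / (HR + lam) - G * G / (H + lam))
            if best is None or gain > best[0]:
                thr = 0.5 * (X[a][j] + X[b][j])
                best = (gain, j, thr, -GL / (HL + lam), -GR / (HR + lam))
    return best

def fit(X, y, rounds=30, eta=0.3, lam=1.0):
    scores = [0.0] * len(X)          # raw log-odds, starting at 0
    stumps = []
    importance = [0.0] * len(X[0])   # total gain per feature
    for _ in range(rounds):
        p = [sigmoid(s) for s in scores]
        g = [pi - yi for pi, yi in zip(p, y)]   # gradient of log loss
        h = [pi * (1.0 - pi) for pi in p]       # Hessian of log loss
        gain, j, thr, wl, wr = best_stump(X, g, h, lam)
        stumps.append((j, thr, eta * wl, eta * wr))
        importance[j] += gain
        scores = [s + (eta * wl if row[j] < thr else eta * wr) for s, row in zip(scores, X)]
    return stumps, importance

def predict_proba(stumps, row):
    return sigmoid(sum(wl if row[j] < thr else wr for j, thr, wl, wr in stumps))

def auc(y, p):
    """Probability that a random defaulter is scored above a random non-defaulter."""
    pos = [pi for pi, yi in zip(p, y) if yi == 1]
    neg = [pi for pi, yi in zip(p, y) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
X_train, y_train = make_data(400, rng)
X_test, y_test = make_data(200, rng)

stumps, importance = fit(X_train, y_train)
test_auc = auc(y_test, [predict_proba(stumps, row) for row in X_test])
ranking = sorted(zip(FEATURES, importance), key=lambda t: -t[1])

print(f"holdout AUC: {test_auc:.3f}")
for name, gain in ranking:
    print(f"{name}: total gain {gain:.2f}")
```

Summing split gain per feature is one common way to define importance in boosted trees; other definitions (split counts, permutation importance) can rank features differently, so the ranking printed here should be read as one view rather than a definitive one.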
Keywords: loan default prediction; XGBoost; machine learning; feature importance; credit scoring; financial risk modeling
Date: 2025
Download: https://pinnaclepubs.com/index.php/EJBEM/article/view/168/179 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:dba:ejbema:v:1:y:2025:i:2:p:141-149
More articles in European Journal of Business, Economics & Management from Pinnacle Academic Press