An explainable SSA-CatBoost machine learning model and application in corporate credit rating: evidence from China

Ruicheng Yang, Pucong Wang, Li Li and Sheng Yong
Additional contact information
Ruicheng Yang: Inner Mongolia University of Finance and Economics
Pucong Wang: Inner Mongolia University of Finance and Economics
Li Li: Marist College
Sheng Yong: Inner Mongolia University of Finance and Economics

Annals of Operations Research, 2025, vol. 354, issue 1, No 10, 273-307

Abstract: This paper investigates feature selection and parameter optimization of machine learning models to enhance classification and prediction performance, with a specific focus on the CatBoost model in corporate credit rating. We propose a novel hybrid model, SSA-CatBoost, which employs the Sparrow Search Algorithm (SSA) to optimize parameters, significantly improving the prediction accuracy of CatBoost. Moreover, to address the inherent black-box nature of machine learning models, the paper uses the SHAP tool to explain how the model determines corporate credit ratings based on diverse features. Based on data extracted from the annual financial reports of Chinese corporations from 2017 to May 2024, the empirical findings yield several key insights: (a) the SSA-CatBoost model achieves superior evaluation metrics, with an accuracy of 0.9911, precision of 0.9983, recall of 0.9983, F1-score of 0.9933, and AUC of 0.9943, surpassing other models such as CatBoost, RF, LightGBM, XGBoost, SVM, and LSTM, as well as their respective SSA-optimized versions; (b) SSA can effectively search for the optimal parameters of machine learning models, thereby improving their predictive performance; (c) the necessity of parameter optimization for the CatBoost model is explained; and (d) the SHAP method demonstrates a positive correlation between corporate credit rating and SHAP value: the higher a corporation's SHAP value, the greater the likelihood that the machine learning model classifies it as a high credit-rated corporation.
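
Illustrative sketch (not from the paper): the abstract describes using SSA to search for model parameters. The Python below is a simplified sparrow-search-style optimizer under stated assumptions, not the authors' implementation. The function names, the population settings, the search bounds, and `toy_validation_loss` (a made-up stand-in for a cross-validated CatBoost error over learning rate and tree depth) are all hypothetical. The scout step is reduced to a single move around the best-known position.

```python
import math
import random


def sparrow_search(objective, bounds, n_sparrows=20, n_iter=50,
                   producer_frac=0.2, scout_frac=0.1, safety=0.8, seed=0):
    """Minimise `objective` over box `bounds` with a simplified sparrow search."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Keep every coordinate inside its [lo, hi] interval.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_sparrows)]
    fit = [objective(x) for x in pop]
    i_best = min(range(n_sparrows), key=fit.__getitem__)
    best_x, best_f = pop[i_best][:], fit[i_best]
    n_prod = max(1, int(producer_frac * n_sparrows))
    n_scout = max(1, int(scout_frac * n_sparrows))

    for _ in range(n_iter):
        # Rank sparrows from best (lowest loss) to worst.
        order = sorted(range(n_sparrows), key=fit.__getitem__)
        pop = [pop[i] for i in order]
        worst_x = pop[-1][:]
        alarm = rng.random()

        # Producers: contract while "safe", otherwise take a random jump.
        for i in range(n_prod):
            if alarm < safety:
                shrink = math.exp(-(i + 1) / (rng.uniform(1e-3, 1.0) * n_iter))
                pop[i] = clip([v * shrink for v in pop[i]])
            else:
                step = rng.gauss(0, 1)
                pop[i] = clip([v + step for v in pop[i]])
        lead = pop[0][:]

        # Scroungers: the worse half relocates, the rest follow the lead producer.
        for i in range(n_prod, n_sparrows):
            if i + 1 > n_sparrows / 2:
                q = rng.gauss(0, 1)
                pop[i] = clip([q * math.exp((w - v) / (i + 1) ** 2)
                               for v, w in zip(pop[i], worst_x)])
            else:
                pop[i] = clip([p + abs(v - p) * rng.choice((-1, 1)) / dim
                               for v, p in zip(pop[i], lead)])

        # Scouts (simplified): a random subset moves around the best-known position.
        for i in rng.sample(range(n_sparrows), n_scout):
            beta = rng.gauss(0, 1)
            pop[i] = clip([b + beta * abs(v - b) for v, b in zip(pop[i], best_x)])

        fit = [objective(x) for x in pop]
        i_best = min(range(n_sparrows), key=fit.__getitem__)
        if fit[i_best] < best_f:  # elitism: never lose the best point seen so far
            best_x, best_f = pop[i_best][:], fit[i_best]
    return best_x, best_f


def toy_validation_loss(params):
    # Hypothetical surrogate with its minimum at learning_rate=0.1, depth=6.
    # In practice this would be replaced by a cross-validated model error.
    learning_rate, depth = params
    return (math.log10(learning_rate) + 1.0) ** 2 + 0.05 * (depth - 6.0) ** 2


# Assumed search box: learning rate in [0.01, 0.3], depth in [2, 10].
best_params, best_loss = sparrow_search(toy_validation_loss,
                                        [(0.01, 0.3), (2.0, 10.0)], seed=1)
print(best_params, best_loss)
```

A real tuning run would round the depth coordinate to an integer before fitting a model, and would use held-out or cross-validated error as the objective rather than a closed-form surrogate.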

Keywords: Corporate credit rating; SSA-CatBoost; Parameter tuning; Explainability; Feature selection
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s10479-025-06513-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:354:y:2025:i:1:d:10.1007_s10479-025-06513-y

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479

DOI: 10.1007/s10479-025-06513-y

Annals of Operations Research is currently edited by Endre Boros

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

Page updated 2025-11-05
Handle: RePEc:spr:annopr:v:354:y:2025:i:1:d:10.1007_s10479-025-06513-y