
Using Machine Learning to Model Bankruptcy Risk in Listed Companies

Vlad Teodorescu and Catalina-Ioana Toader
Additional contact information
Vlad Teodorescu: Bucharest University of Economic Studies, Bucharest, Romania
Catalina-Ioana Toader: Bucharest University of Economic Studies, Bucharest, Romania

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ECONOMICS AND SOCIAL SCIENCES, 2024, vol. 6, issue 1, 610-619

Abstract: This article studies the optimisation and relative performance of three classes of machine learning models (logistic regression with regularisation, Random Forest, and XGBoost) for quantifying the probability of bankruptcy, using financial data from a database of listed companies in Taiwan. The database covers the period from 1999 to 2009, contains 95 financial ratios from 7 categories, has 6,819 observations, and has a bankruptcy rate of approximately 3.2%. We chose this database because it is publicly available, of high quality, and of moderate size, traits that permit the rapid training of machine learning models. As a result, we were able to run experiments on multiple model configurations and to compare our results with those obtained by other researchers. The k-fold cross-validation methodology can be used to split the data into training and testing sets. We investigate the validity of its use, especially in the context of XGBoost with early stopping evaluated on the test fold. We also determine the sensitivity of predictive performance to the value of k and to the specific folds created. We use AUROC as the performance measure and show that Random Forest models significantly outperform logistic models with regularisation, while XGBoost models perform moderately better than Random Forest. For each type of model, we study hyperparameter tuning and demonstrate that this process has a significant effect on predictive performance. For the first two types of model, we perform a full grid search. For XGBoost models, we use a guided (sequential) grid search methodology. Furthermore, we study and propose a criterion for hyperparameter tuning that uses average performance instead of maximum performance, highlighting the relatively large effect on predictive performance of the stochastic component these machine learning algorithms employ during training. Our research also indicates that, for some hyperparameters, tuning can shape predictive performance. Finally, we assess the importance of the variables in forecasting the likelihood of bankruptcy, as indicated by the three classes of models.
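
The evaluation ideas named in the abstract (AUROC as the performance measure, k-fold splits on a heavily imbalanced sample, and choosing a configuration by average rather than maximum performance) can be sketched in a few lines of plain Python. This is a minimal illustration only, not the authors' implementation: the function names `auroc`, `stratified_folds` and `select_config`, the use of stratification, and the example scores are all assumptions made for this sketch.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC via the rank (Mann-Whitney) formulation: the share of
    (positive, negative) pairs in which the positive scores higher,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed):
    """Split indices into k folds, dealing each class out round-robin so
    that rare positives (e.g. bankruptcies) appear in every fold.
    Stratification is an assumption of this sketch."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def select_config(results, criterion=mean):
    """Pick the configuration whose repeated-run AUROCs score best under
    `criterion` (mean by default; pass `max` to compare)."""
    return max(results, key=lambda name: criterion(results[name]))


# Hypothetical AUROCs from three repeated runs (different seeds) of two
# made-up configurations; these numbers are illustrative, not results.
runs = {
    "config_a": [0.90, 0.96, 0.89],
    "config_b": [0.93, 0.94, 0.93],
}
by_mean = select_config(runs)       # rewards consistently good runs
by_max = select_config(runs, max)   # can reward one lucky seed
print(by_mean, by_max)
```

With these made-up scores the two criteria disagree: the maximum favours the configuration with a single high run, while the mean favours the one that is steadier across seeds, which is the motivation the abstract gives for averaging.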

Keywords: bankruptcy risk; probability of bankruptcy; machine learning; xgboost; random forest.
JEL-codes: C53 C55 D81 G2 G32
Date: 2024

Downloads: (external link)
https://www.icess.ase.ro/using-machine-learning-to ... in-listed-companies/ (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:rom:conase:v:6:y:2024:i:1:p:610-619

More articles in PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ECONOMICS AND SOCIAL SCIENCES from Bucharest University of Economic Studies, Romania Contact information at EDIRC.
Bibliographic data for series maintained by Zamfir Andreea.

 
Page updated 2025-03-19
Handle: RePEc:rom:conase:v:6:y:2024:i:1:p:610-619