The Impact of Feature Selection and Transformation on Machine Learning Methods in Determining the Credit Scoring
Oguz Koc, Omur Ugur and A. Sevtap Kestel
Papers from arXiv.org
Abstract:
Banks use credit scoring as an important indicator of financial strength and eligibility for credit. Scoring models assign statistical odds or probabilities to predict the risk of nonpayment in relation to the many other factors that may be involved. This paper aims to illustrate the beneficial use of eight machine learning (ML) methods (Support Vector Machine, Gaussian Naive Bayes, Decision Trees, Random Forest, XGBoost, K-Nearest Neighbors, Multi-layer Perceptron Neural Networks) and Logistic Regression in finding the default risk as well as the features contributing to it. An extensive comparison is made in three aspects: (i) which ML model, with and without its own wrapper feature selection, performs best; (ii) how feature selection combined with an appropriate data scaling method influences performance; (iii) which of the most successful combinations (algorithm, feature selection, and scaling) delivers the best validation indicators, such as accuracy rate, Type I and Type II errors, and AUC. Open-access credit scoring default risk data sets on German and Australian cases are used, for which we determine the best method, the best scaling, and the features contributing most to default risk, and compare our findings with those in the related literature. We illustrate the positive contribution of the selection method and scaling to the performance indicators compared to the existing literature.
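As an illustration of the validation indicators named in the abstract, the following is a minimal sketch (not the authors' code) of how accuracy, Type I and Type II error rates, and AUC can be computed for a binary default classifier. The function name `validation_indicators`, the 0.5 threshold, and the toy labels and scores are hypothetical. It assumes label 1 means "default", takes Type I error as the false positive rate and Type II error as the false negative rate; some credit scoring studies use the reverse convention, and the abstract does not say which one the paper follows.

```python
from typing import Dict, Sequence


def validation_indicators(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Accuracy, Type I/II error rates, and AUC for a binary classifier.

    Assumption: label 1 = default (positive class), label 0 = no default.
    """
    # Turn scores into hard predictions at the chosen (hypothetical) threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    n_pos = tp + fn
    n_neg = tn + fp
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Both classes must be present to compute the indicators.")

    # AUC as the probability that a randomly chosen defaulter receives a
    # higher score than a randomly chosen non-defaulter (ties count as 0.5).
    pos_scores = [s for t, s in zip(y_true, y_score) if t == 1]
    neg_scores = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )

    return {
        "accuracy": (tp + tn) / (n_pos + n_neg),
        "type_i_error": fp / n_neg,   # non-defaulters flagged as defaulters
        "type_ii_error": fn / n_pos,  # defaulters classified as non-defaulters
        "auc": wins / (n_pos * n_neg),
    }


# Toy data, invented purely for illustration.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.6, 0.8, 0.4, 0.2, 0.9]
print(validation_indicators(toy_labels, toy_scores))
```

The pairwise AUC computation is quadratic in the sample size, which is acceptable for data sets of roughly a thousand records; a rank-based formula would be the usual choice for larger samples.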
Date: 2023-03
New Economics Papers: this item is included in nep-ban, nep-big, nep-cmp, nep-des and nep-rmg
Published in Oguz Koc. Comparison of Machine Learning Algorithms on Consumer Credit Classification. M.Sc. Thesis, Middle East Technical University, 2019
Download: http://arxiv.org/pdf/2303.05427 (latest version, PDF)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2303.05427