A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring
Fatemeh Nemati Koutanaei,
Hedieh Sajedi and
Mohammad Khanbabaei
Journal of Retailing and Consumer Services, 2015, vol. 27, issue C, 11-23
Abstract:
Data mining techniques have numerous applications in credit scoring of customers in the banking field. One of the most popular data mining techniques is the classification method. Previous researches have demonstrated that using the feature selection (FS) algorithms and ensemble classifiers can improve the banks' performance in credit scoring problems. In this domain, the main issue is the simultaneous and the hybrid utilization of several FS and ensemble learning classification algorithms with respect to their parameters setting, in order to achieve a higher performance in the proposed model. As a result, the present paper has developed a hybrid data mining model of feature selection and ensemble learning classification algorithms on the basis of three stages. The first stage, as expected, deals with the data gathering and pre-processing. In the second stage, four FS algorithms are employed, including principal component analysis (PCA), genetic algorithm (GA), information gain ratio, and relief attribute evaluation function. In here, parameters setting of FS methods is based on the classification accuracy resulted from the implementation of the support vector machine (SVM) classification algorithm. After choosing the appropriate model for each selected feature, they are applied to the base and ensemble classification algorithms. In this stage, the best FS algorithm with its parameters setting is indicated for the modeling stage of the proposed model. In the third stage, the classification algorithms are employed for the dataset prepared from each FS algorithm. The results exhibited that in the second stage, PCA algorithm is the best FS algorithm. In the third stage, the classification results showed that the artificial neural network (ANN) adaptive boosting (AdaBoost) method has higher classification accuracy. Ultimately, the paper verified and proposed the hybrid model as an operative and strong model for performing credit scoring.
Keywords: Credit scoring; Classification; Feature selection; Ensemble learning; Data mining (search for similar items in EconPapers)
Date: 2015
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (19)
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0969698915300060
Full text for ScienceDirect subscribers only
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:eee:joreco:v:27:y:2015:i:c:p:11-23
DOI: 10.1016/j.jretconser.2015.07.003
Access Statistics for this article
Journal of Retailing and Consumer Services is currently edited by Harry Timmermans
More articles in Journal of Retailing and Consumer Services from Elsevier
Bibliographic data for series maintained by Catherine Liu ().