Novel multilayer stacking framework with weighted ensemble approach for multiclass credit scoring problem application
Marek Stelmach and
Marcin Chlebus
Additional contact information
Marek Stelmach: Faculty of Economic Sciences, University of Warsaw
No 2020-08, Working Papers from Faculty of Economic Sciences, University of Warsaw
Abstract:
Stacked ensemble approaches have recently been gaining importance in complex predictive problems where extraordinary performance is desirable. In this paper we develop a multilayer stacking framework and apply it to a large credit scoring dataset with multiple, imbalanced classes. Diverse base estimators (among others, bagged and boosted tree algorithms, regularized logistic regression, neural networks, and a Naive Bayes classifier) are examined, and we propose three meta learners that are finally combined into a novel weighted ensemble. To prevent bias in meta-feature construction, we introduce a nested cross-validation scheme into the architecture, while a weighted log loss evaluation metric is used to overcome training bias towards the majority class. Additional emphasis is placed on proper data preprocessing steps and on Bayesian optimization for hyperparameter tuning to ensure that the solution does not overfit. Our study indicates better stacking results compared to all individual base classifiers, yet we stress the importance of assessing whether the improvement compensates for the increased computational time and design complexity. Furthermore, the analysis shows extremely good performance of bagged and boosted trees in both the base and meta learning phases. We conclude with the thesis that a weighted meta ensemble with regularization properties shows the least tendency to overfit.
Keywords: stacked ensembles; nested cross-validation; Bayesian optimization; multiclass problem; imbalanced classes
JEL-codes: C38 C51 C52 C55 G32
Pages: 33
Date: 2020
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ecm and nep-ore
Downloads:
https://www.wne.uw.edu.pl/index.php/download_file/5525/ First version, 2020 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:war:wpaper:2020-08