An implementation of ensemble methods, logistic regression, and neural network for default prediction in Peer-to-Peer lending
Aneta Dzik-Walczak and
Mateusz Heba
Additional contact information
Aneta Dzik-Walczak: University of Warsaw – Faculty of Economic Sciences, Długa 44/50, 00-241 Warsaw, Poland
Mateusz Heba: University of Warsaw – Faculty of Economic Sciences, Długa 44/50, 00-241 Warsaw, Poland
Zbornik radova Ekonomskog fakulteta u Rijeci/Proceedings of Rijeka Faculty of Economics, 2021, vol. 39, issue 1, 163-197
Abstract:
Credit scoring has become an important issue because competition among financial institutions is intense, and even a small improvement in predictive accuracy can yield significant savings. Financial institutions seek optimal strategies based on credit scoring models, so credit scoring tools are studied extensively. As a result, various parametric statistical methods, non-parametric statistical tools, and soft computing approaches have been developed to improve the accuracy of credit scoring models. In this paper, different approaches are used to classify customers into those who repay a loan and those who default. The purpose of this study is to investigate the performance of two credit scoring techniques: a logistic regression model estimated on categorized variables transformed using WOE (Weight of Evidence), and neural networks. We also combine multiple classifiers and test whether ensemble learning performs better. To evaluate the feasibility and effectiveness of these methods, the analysis is performed on Lending Club data. In addition, we investigate peer-to-peer lending, also called social lending. The results indicate that the logistic regression model can outperform neural networks. The proposed ensemble model (a combination of logistic regression and a neural network, obtained by averaging the probabilities from both models) has a higher AUC, Gini coefficient, and Kolmogorov-Smirnov statistic than the other models. We therefore conclude that the ensemble model can successfully reduce the potential risk of losses due to misclassification costs.
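The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`woe_table`, `average_ensemble`, `auc`, `gini`, `ks_statistic`) and the toy data are hypothetical, and the WOE sign convention used here (log of the share of non-defaulters over the share of defaulters in each category) is an assumption, since the abstract does not state which convention the paper follows. The two probability lists stand in for the outputs of an already fitted logistic regression and neural network.

```python
from collections import Counter
from math import log


def woe_table(categories, defaulted):
    """Weight of Evidence per category (assumed convention: ln(%good / %bad))."""
    good = Counter(c for c, d in zip(categories, defaulted) if d == 0)
    bad = Counter(c for c, d in zip(categories, defaulted) if d == 1)
    n_good, n_bad = sum(good.values()), sum(bad.values())
    # Assumes every category holds at least one good and one bad loan;
    # otherwise the ratio is undefined and some smoothing would be needed.
    return {c: log((good[c] / n_good) / (bad[c] / n_bad)) for c in set(categories)}


def average_ensemble(p_a, p_b):
    """Ensemble by averaging the default probabilities of two models."""
    return [(a + b) / 2 for a, b in zip(p_a, p_b)]


def _split(scores, defaulted):
    """Separate scores of defaulters (label 1) from non-defaulters (label 0)."""
    pos = [s for s, d in zip(scores, defaulted) if d == 1]
    neg = [s for s, d in zip(scores, defaulted) if d == 0]
    return pos, neg


def auc(scores, defaulted):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly."""
    pos, neg = _split(scores, defaulted)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(scores, defaulted):
    """Gini coefficient derived from AUC."""
    return 2 * auc(scores, defaulted) - 1


def ks_statistic(scores, defaulted):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos, neg = _split(scores, defaulted)
    return max(
        abs(sum(s <= t for s in pos) / len(pos) - sum(s <= t for s in neg) / len(neg))
        for t in set(scores)
    )


if __name__ == "__main__":
    # Toy labels and probabilities, invented for illustration only.
    defaulted = [0, 0, 1, 0, 1, 1]
    p_logit = [0.10, 0.30, 0.35, 0.40, 0.70, 0.80]
    p_net = [0.20, 0.15, 0.55, 0.60, 0.50, 0.90]
    p_ens = average_ensemble(p_logit, p_net)
    for name, p in [("logit", p_logit), ("net", p_net), ("ensemble", p_ens)]:
        print(name, auc(p, defaulted), gini(p, defaulted), ks_statistic(p, defaulted))
```

The pairwise AUC above is quadratic in the sample size, which is acceptable for a demonstration but not for a dataset the size of Lending Club; a rank-based computation or a library routine would be the practical choice there.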
Keywords: credit scoring; ensemble methods; logistic regression; neural nets; peer-to-peer lending
JEL-codes: G21 G32
Date: 2021
Downloads: (external link)
https://www.efri.uniri.hr/upload/Zbornik%201_2021/ ... zak_et_al-2021-1.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:rfe:zbefri:v:39:y:2021:i:1:p:163-197
More articles in Zbornik radova Ekonomskog fakulteta u Rijeci/Proceedings of Rijeka Faculty of Economics from University of Rijeka, Faculty of Economics and Business
Bibliographic data for series maintained by Danijela Ujcic.