EconPapers    
Economics at your fingertips  
 

Is COVID-19 reflected in AnaCredit dataset? A big data - machine learning approach for analysing behavioural patterns using loan level granular information

Anastasios Petropoulos, Evangelos Stavroulakis, Panagiotis Lazaris, Vasilis Siakoulis and Nikolaos Vlachogiannakis
Additional contact information
Anastasios Petropoulos: Bank of Greece
Evangelos Stavroulakis: Bank of Greece
Panagiotis Lazaris: Bank of Greece
Vasilis Siakoulis: Bank of Greece
Nikolaos Vlachogiannakis: Bank of Greece

No 315, Working Papers from Bank of Greece

Abstract: In this study, we explore the impact of the COVID-19 pandemic on the default risk of loan portfolios in the Greek banking system, using cutting-edge machine learning technologies such as deep learning. Our analysis is based on loan-level monthly data, spanning a 42-month period, collected through the ECB AnaCredit database. Our dataset contains more than three million records, covering both the pre- and post-pandemic periods. We develop a series of credit rating models implementing state-of-the-art machine learning algorithms. Through an extensive validation process, we identify the best machine learning technique for building a behavioral credit scoring model, and subsequently investigate the estimated sensitivities of various features in predicting default risk. To select the best candidate model, we compare the classification accuracy of the proposed methods over a 2-month out-of-time period. Our empirical results indicate that Deep Neural Networks (DNN) have superior predictive performance, signalling better generalization capacity than Random Forests, Extreme Gradient Boosting (XGBoost), and logistic regression. The proposed DNN model can accurately simulate the non-linearities caused by the pandemic outbreak in the evolution of default rates for Greek corporate customers. Under this multivariate setup, we apply interpretability algorithms to isolate the impact of COVID-19 on the probability of default, controlling for the rest of the features of the DNN. Our results indicate that the impact of the pandemic peaks in the first year and then slowly decreases, though without yet returning to pre-COVID-19 levels. Furthermore, our empirical results suggest different behavioral patterns between Stage 1 and Stage 2 loans, and that default rate sensitivities vary significantly across sectors.
The current empirical work can facilitate a more in-depth analysis of the AnaCredit database by providing robust statistical tools for more effective and responsive micro and macro supervision of credit risk.

Keywords: Credit Risk; Deep Learning; AnaCredit; COVID-19
JEL-codes: C38 C45 C55 G24
Pages: 56
Date: 2023-03
New Economics Papers: this item is included in nep-ban, nep-big, nep-cmp, nep-des and nep-mac

Downloads: (external link)
https://doi.org/10.52903/wp2023315 Full Text (application/pdf)
Our link check indicates that this URL is bad, the error code is: 403 Forbidden (https://doi.org/10.52903/wp2023315 [302 Found]--> https://www.bankofgreece.gr/Publications/Paper2023315.pdf)

Persistent link: https://EconPapers.repec.org/RePEc:bog:wpaper:315

Bibliographic data for series maintained by Anastasios Rizos.

 
Page updated 2025-03-30
Handle: RePEc:bog:wpaper:315