A novel credit model risk measure: Do more data lead to lower model risk?
Valter T. Yoshida, Rafael Schiozer, Alan de Genaro and Toni R.E. dos Santos
The Quarterly Review of Economics and Finance, 2025, vol. 100, issue C
Abstract:
Large databases and Machine Learning enhance our capacity to develop models with many observations and explanatory variables. While the literature has primarily focused on optimizing classifications, little attention has been given to model risk, especially originating from inadequate use. To address this gap, we introduce a new metric for assessing model risk in credit applications. We test the metric using cross-section LASSO default models, each incorporating 200 thousand loan observations from several banks and more than 100 explanatory variables. The results indicate that models that use loans from a single bank have lower model risk than models using loans from the entire financial system. Therefore, adding loans from different banks to increase the number of observations in a model is suboptimal, challenging the widely accepted assumption that more data leads to better predictions.
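The comparison described in the abstract (a default model fitted on one bank's loans versus one fitted on loans pooled across banks) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: the data are synthetic, the bank-specific coefficients are invented, and out-of-sample log-loss stands in as a generic evaluation score. The abstract does not define the paper's model risk metric, so this sketch does not implement it. All function names are hypothetical, and the LASSO fit uses a plain proximal gradient loop rather than any particular library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(x, t):
    """Proximal operator of the L1 penalty: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def fit_lasso_logit(X, y, lam, step=0.5, n_iter=150):
    """L1-penalised logistic regression via proximal gradient descent.

    The intercept is left unpenalised. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            # Residual between predicted default probability and label.
            r = sigmoid(b0 + sum(b * v for b, v in zip(beta, xi))) - yi
            g0 += r
            for j in range(p):
                g[j] += r * xi[j]
        b0 -= step * g0 / n
        # Gradient step followed by soft-thresholding (the LASSO part).
        beta = [soft_threshold(b - step * gj / n, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta


def log_loss(model, X, y):
    """Average negative log-likelihood of default labels under the model."""
    b0, beta = model
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(b0 + sum(b * v for b, v in zip(beta, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def simulate_bank(rng, n, true_beta, intercept=-1.5):
    """Hypothetical loan book: Gaussian features, Bernoulli default flag."""
    X = [[rng.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(
            intercept + sum(b * v for b, v in zip(true_beta, xi))) else 0
         for xi in X]
    return X, y


def compare_single_vs_pooled(seed=0, n=200, lam=0.02):
    """Score a single-bank model and a pooled model on bank 0's hold-out."""
    rng = random.Random(seed)
    # Assumption: each bank has its own default drivers (invented values).
    betas = [[1.0, -0.8, 0.0, 0.0, 0.5, 0.0],
             [-0.6, 0.0, 0.9, 0.0, 0.0, 0.4],
             [0.0, 0.7, -0.5, 0.6, 0.0, 0.0]]
    train = [simulate_bank(rng, n, b) for b in betas]
    X_test, y_test = simulate_bank(rng, n, betas[0])
    single = fit_lasso_logit(*train[0], lam)
    X_pool = [row for X, _ in train for row in X]
    y_pool = [lab for _, y in train for lab in y]
    pooled = fit_lasso_logit(X_pool, y_pool, lam)
    return {"single_bank": log_loss(single, X_test, y_test),
            "pooled": log_loss(pooled, X_test, y_test)}
```

Calling `compare_single_vs_pooled()` returns the two hold-out scores for the simulated target bank. Any gap between them reflects only the invented heterogeneity across the simulated banks and says nothing about the paper's empirical results.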
Keywords: Model risk; Model selection; Credit risk; Credit scoring; Big data; Machine learning
JEL-codes: C52 C55
Date: 2025
Downloads:
http://www.sciencedirect.com/science/article/pii/S1062976925000018
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:quaeco:v:100:y:2025:i:c:s1062976925000018
DOI: 10.1016/j.qref.2025.101960
The Quarterly Review of Economics and Finance is currently edited by R. J. Arnould and J. E. Finnerty
More articles in The Quarterly Review of Economics and Finance from Elsevier
Bibliographic data for series maintained by Catherine Liu.