A novel framework for enhancing transparency in credit scoring: Leveraging Shapley values for interpretable credit scorecards
Rivalani Hlongwane,
Kutlwano Ramabao and
Wilson Mongwe
PLOS ONE, 2024, vol. 19, issue 8, 1-20
Abstract:
Credit scorecards are essential tools for banks to assess the creditworthiness of loan applicants. While advanced machine learning models like XGBoost and random forest often outperform traditional logistic regression in predictive accuracy, their lack of interpretability hinders their adoption in practice. This study bridges the gap between research and practice by developing a novel framework for constructing interpretable credit scorecards using Shapley values. We apply this framework to two credit datasets, discretizing numerical variables and utilizing one-hot encoding to facilitate model development. Shapley values are then employed to derive credit scores for each predictor variable group in XGBoost, random forest, LightGBM, and CatBoost models. Our results demonstrate that this approach yields credit scorecards with interpretability comparable to logistic regression while maintaining superior predictive accuracy. This framework offers a practical and effective solution for credit practitioners seeking to leverage the power of advanced models without sacrificing transparency and regulatory compliance.
Date: 2024
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0308718 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 08718&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0308718
DOI: 10.1371/journal.pone.0308718
Access Statistics for this article
More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().