Quality checks on granular banking data: an experimental approach based on machine learning
Fabio Zambuto,
Maria Rosaria Buzzi,
Giuseppe Costanzo,
Marco Di Lucido,
Barbara La Ganga,
Pasquale Maddaloni,
Fabio Papale and
Emiliano Svezia
Additional contact information
Fabio Zambuto: Bank of Italy
Maria Rosaria Buzzi: Bank of Italy
Giuseppe Costanzo: Bank of Italy
Marco Di Lucido: Bank of Italy
Barbara La Ganga: Bank of Italy
Pasquale Maddaloni: Bank of Italy
Fabio Papale: Bank of Italy
Emiliano Svezia: Bank of Italy
No 547, Questioni di Economia e Finanza (Occasional Papers) from Bank of Italy, Economic Research and International Relations Area
We propose a new methodology, based on machine learning algorithms, for the automatic detection of outliers in the data that banks report to the Bank of Italy. Our analysis focuses on granular data gathered within the statistical data collection on payment services, in which the lack of strong ex ante deterministic relationships among the collected variables makes standard diagnostic approaches less powerful. Quantile regression forests are used to derive a region of acceptance for the targeted information. For a given level of probability, plausibility thresholds are obtained on the basis of individual bank characteristics and are automatically updated as new data are reported. The approach was applied to validate semi-annual data on debit card issuance received from reporting agents between December 2016 and June 2018. The algorithm was trained with data reported in previous periods and tested by cross-checking the identified outliers with the reporting agents. The method made it possible to detect, with a high level of precision in terms of false positives, new outliers that had not been detected using the standard procedures.
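The region-of-acceptance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the two features (`size`, `share`) and the helper names (`grow`, `conditional_quantiles`, `check`) are hypothetical, and the forest is a deliberately small from-scratch variant in which each node tries a single randomly chosen feature and each leaf keeps its in-bag target values. The sketch only shows how conditional quantiles pooled across trees can yield lower and upper plausibility thresholds that depend on a bank's characteristics.

```python
import random
from statistics import fmean

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = fmean(ys)
    return sum((y - m) ** 2 for y in ys)

def grow(rows, rng, min_leaf=5):
    """Grow one regression tree; a leaf is the list of target values it holds."""
    best = None
    if len(rows) >= 2 * min_leaf:
        j = rng.randrange(len(rows[0][0]))      # one random feature per node
        vals = sorted({x[j] for x, _ in rows})
        for a, b in zip(vals, vals[1:]):
            thr = (a + b) / 2
            left = [r for r in rows if r[0][j] <= thr]
            right = [r for r in rows if r[0][j] > thr]
            if min(len(left), len(right)) < min_leaf:
                continue
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                best = (cost, j, thr, left, right)
    if best is None:
        return [y for _, y in rows]             # leaf node
    _, j, thr, left, right = best
    return (j, thr, grow(left, rng, min_leaf), grow(right, rng, min_leaf))

def leaf_values(node, x):
    """Follow the splits until the leaf that contains x is reached."""
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if x[j] <= thr else right
    return node

def conditional_quantiles(forest, x, probs):
    """Weighted empirical quantiles of the targets sharing a leaf with x."""
    weighted = []
    for tree in forest:
        ys = leaf_values(tree, x)
        w = 1.0 / (len(ys) * len(forest))       # each tree contributes 1/n_trees
        weighted.extend((y, w) for y in ys)
    weighted.sort()
    out = []
    for p in probs:
        acc = 0.0
        for y, w in weighted:
            acc += w
            if acc >= p - 1e-12:
                out.append(y)
                break
    return out

def check(forest, x, reported, alpha=0.01):
    """Return the acceptance bounds and whether the reported value lies outside."""
    lo, hi = conditional_quantiles(forest, x, [alpha / 2, 1 - alpha / 2])
    return lo, hi, not (lo <= reported <= hi)

# Synthetic training data (assumption): cards issued grow with bank size.
rng = random.Random(0)
train = []
for _ in range(150):
    size = rng.uniform(1.0, 10.0)               # hypothetical bank-size feature
    share = rng.uniform(0.0, 1.0)               # hypothetical, uninformative feature
    train.append(((size, share), 100.0 * size + rng.gauss(0.0, 20.0)))

# Each tree is grown on a bootstrap sample of the training rows.
forest = [grow([rng.choice(train) for _ in train], rng) for _ in range(20)]

# A mid-sized bank reporting a value far above anything seen in training.
lo, hi, flagged = check(forest, (5.0, 0.5), 5000.0)
```

Re-growing the forest whenever a new reporting period is appended to `train` would mimic, in a simplified way, the automatic updating of thresholds mentioned in the abstract. The choice of `alpha` sets the probability level of the acceptance region and therefore the trade-off between missed outliers and false positives.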
Keywords: banking data; data quality management; outlier detection; machine learning; quantile regression; random forests
JEL-codes: C18 C81 G21
New Economics Papers: this item is included in nep-big, nep-cmp and nep-pay
Persistent link: https://EconPapers.repec.org/RePEc:bdi:opques:qef_547_20