A methodology for automised outlier detection in high-dimensional datasets: an application to euro area banks' supervisory data

Matteo Farnè and Angelos Vouldis

No 2171, Working Paper Series from European Central Bank

Abstract: Outlier detection in high-dimensional datasets poses new challenges that have not been investigated in the literature. In this paper, we present an integrated methodology for the identification of outliers that is suitable for datasets with a higher number of variables than observations. Our method aims to use all the relevant information present in a dataset to detect outliers in an automated way, a feature that renders the method suitable for application to high-dimensional datasets. Our proposed five-step procedure for regression outlier detection entails a robust selection stage of the most explanatory variables, the estimation of a robust regression model based on the selected variables, and a criterion to identify outliers based on robust measures of the residuals' dispersion. The proposed procedure also deals with data redundancy and missing observations, which may inhibit the statistical processing of the data due to the ill-conditioning of the covariance matrix. The method is validated in a simulation study and in an application to actual supervisory data on banks’ total assets.

JEL Classification: C18, C81, G21
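The select, fit, and flag sequence described in the abstract can be illustrated with a minimal sketch. This is not the authors' five-step procedure: as stand-ins it assumes Spearman rank-correlation screening for the selection stage, a Huber M-estimator fitted by iteratively reweighted least squares for the robust regression, and a median/MAD rule on the residuals for the outlier criterion. All function names, the tuning constants, and the simulated data are hypothetical, and the redundancy and missing-data handling mentioned in the abstract is omitted.

```python
import numpy as np

def rank(a):
    # Ordinal ranks along axis 0 (ties broken by position); fine for continuous data.
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)

def select_variables(X, y, k):
    # Assumed screening rule: keep the k columns with the largest
    # absolute Spearman correlation with the response.
    Xc = rank(X) - rank(X).mean(axis=0)
    yc = rank(y) - rank(y).mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.sort(np.argsort(-np.abs(corr))[:k])

def mad_scale(r):
    # Median absolute deviation, rescaled to be consistent with the
    # standard deviation under normality.
    return 1.4826 * np.median(np.abs(r - np.median(r)))

def huber_fit(X, y, c=1.345, n_iter=50):
    # Huber M-estimation via iteratively reweighted least squares.
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - A @ beta
        u = np.abs(r) / max(mad_scale(r), 1e-12)
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))  # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    return beta, y - A @ beta

def flag_outliers(residuals, threshold=3.0):
    # Flag observations whose robustly standardised residual exceeds the threshold.
    z = np.abs(residuals - np.median(residuals)) / max(mad_scale(residuals), 1e-12)
    return z > threshold

# Simulated example with more variables (100) than observations (40).
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 1] + 2 * X[:, 2] + 0.5 * rng.normal(size=n)
y[[3, 17]] += 40.0  # two planted outliers in the response

selected = select_variables(X, y, k=5)
beta, residuals = huber_fit(X[:, selected], y)
flags = flag_outliers(residuals)
print("selected columns:", selected)
print("flagged rows:", np.flatnonzero(flags))
```

Screening down to a handful of columns before fitting is what keeps the regression well-posed when variables outnumber observations; the threshold of 3 robust standard deviations is a conventional choice, not one taken from the paper.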

Keywords: banking data; high dimension; missing data; outlier detection; robust regression; variable selection
Date: 2018-07
New Economics Papers: this item is included in nep-ecm
Note: 1570817
Citations: 2 (in EconPapers)

Downloads: (external link)
https://www.ecb.europa.eu//pub/pdf/scpwps/ecb.wp2171.en.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ecb:ecbwps:20182171

Page updated 2025-03-19
Handle: RePEc:ecb:ecbwps:20182171