A New Tool for Robust Estimation and Identification of Unusual Data Points
Christian Garciga and Randal Verbrugge
No 20-08, Working Papers from Federal Reserve Bank of Cleveland
Abstract:
Most consistent estimators are what Müller (2007) terms “highly fragile”: prone to total breakdown in the presence of a handful of unusual data points. This compromises inference. Robust estimation is a (seldom-used) solution, but commonly used methods have drawbacks. In this paper, building on methods that are relatively unknown in economics, we provide a new tool for robust estimates of mean and covariance, useful both for robust estimation and for detection of unusual data points. It is relatively fast and useful for large data sets. Our performance testing indicates that our baseline method performs on par with, or better than, two of the currently best available methods, and that it works well on benchmark data sets. We also demonstrate that the issues we discuss are not merely hypothetical, by re-examining a prominent economic study and demonstrating its central results are driven by a set of unusual points.
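The fragility described in the abstract can be illustrated with a deliberately simplified, univariate sketch. It is not the paper's method, which concerns multivariate mean and covariance and builds on detMCD and RMVN. It swaps in the median and the median absolute deviation (MAD) as robust stand-ins for the mean and standard deviation. The data values, the function name `robust_z_scores`, and the 3.5 cutoff are assumptions chosen for illustration only.

```python
import statistics

def robust_z_scores(values):
    """Robust z-scores using the median and the MAD (hypothetical helper)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    return [(v - med) / scale for v in values]

def classical_z_scores(values):
    """Classical z-scores using the sample mean and standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Made-up data: eight well-behaved points plus two unusual ones.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
contaminated = clean + [250.0, 300.0]

CUTOFF = 3.5  # illustrative threshold, an assumption rather than a rule

# The mean moves far from the bulk of the data; the median barely moves.
print("mean:  ", statistics.mean(clean), "->", statistics.mean(contaminated))
print("median:", statistics.median(clean), "->", statistics.median(contaminated))

classical_flagged = [
    v for v, z in zip(contaminated, classical_z_scores(contaminated))
    if abs(z) > CUTOFF
]
robust_flagged = [
    v for v, z in zip(contaminated, robust_z_scores(contaminated))
    if abs(z) > CUTOFF
]

print("flagged by classical z-scores:", classical_flagged)
print("flagged by robust z-scores:   ", robust_flagged)
```

In this toy sample the two unusual points inflate the classical standard deviation enough to hide themselves (an effect often called masking), while the median and MAD stay close to the bulk of the data and flag both points.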
Keywords: big data; machine learning; robust estimation; detMCD; RMVN; fragility; outlier identification
JEL-codes: C3 C4 C5
Pages: 52
Date: 2020-03-05
New Economics Papers: this item is included in nep-big, nep-cmp and nep-ecm
Downloads: https://doi.org/10.26509/frbc-wp-202008 (Full Text)
Persistent link: https://EconPapers.repec.org/RePEc:fip:fedcwq:87580
DOI: 10.26509/frbc-wp-202008