Distance-based outlier detection for high dimension, low sample size data
Jeongyoun Ahn,
Myung Hee Lee and
Jung Ae Lee
Journal of Applied Statistics, 2019, vol. 46, issue 1, 13-29
Abstract:
Despite the popularity of high dimension, low sample size data analysis, there has not been enough attention to the sample integrity issue, in particular, a possibility of outliers in the data. A new outlier detection procedure for data with much larger dimensionality than the sample size is presented. The proposed method is motivated by asymptotic properties of high-dimensional distance measures. Empirical studies suggest that high-dimensional outlier detection is more likely to suffer from a swamping effect rather than a masking effect, thus yields more false positives than false negatives. We compare the proposed approaches with existing methods using simulated data from various population settings. A real data example is presented with a consideration on the implication of found outliers.
Date: 2019
References: Add references at CitEc
Citations: View citations in EconPapers (2)
Downloads: (external link)
http://hdl.handle.net/10.1080/02664763.2018.1452901 (text/html)
Access to full text is restricted to subscribers.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:taf:japsta:v:46:y:2019:i:1:p:13-29
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/CJAS20
DOI: 10.1080/02664763.2018.1452901
Access Statistics for this article
Journal of Applied Statistics is currently edited by Robert Aykroyd
More articles in Journal of Applied Statistics from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst ().