Concentration bounds for the empirical angular measure with statistical learning applications
Stéphan Clémençon,
Hamid Jalalzai,
Anne Sabourin,
Stéphane and
Johan Segers
Additional contact information
Stéphan Clémençon: Télécom Paris, France
Hamid Jalalzai: Télécom Paris, France
Anne Sabourin: Télécom Paris, France
Stéphane: Université catholique de Louvain, LIDAM/ISBA, Belgium
Johan Segers: Université catholique de Louvain, LIDAM/ISBA, Belgium
No 2021023, LIDAM Discussion Papers ISBA from Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA)
Abstract:
The angular measure on the unit sphere characterizes the first-order dependence structure of the components of a random vector in extreme regions and is defined in terms of standardized margins. Its statistical recovery is an important step in learning problems involving observations far away from the center. In the common situation when the components of the vector have different distributions, the rank transformation offers a convenient and robust way of standardizing data in order to build an empirical version of the angular measure based on the most extreme observations. However, the study of the sampling distribution of the resulting empirical angular measure is challenging. It is the purpose of the paper to establish finite-sample bounds for the maximal deviations between the empirical and true angular measures, uniformly over classes of Borel sets of controlled combinatorial complexity. The bounds are valid with high probability and scale essentially as the inverse of the square root of the effective sample size, up to a logarithmic factor. Discarding the most extreme observations yields a truncated version of the empirical angular measure for which the logarithmic factor in the concentration bound is replaced by a factor depending on the truncation level. The bounds are applied to provide performance guarantees for two statistical learning procedures tailored to extreme regions of the input space and built upon the empirical angular measure: binary classification in extreme regions through empirical risk minimization and unsupervised anomaly detection through minimum-volume sets of the sphere.
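The construction described in the abstract (rank-standardize the margins, keep the most extreme points, and look at their directions) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the standardization n / (n + 1 - rank), the threshold n / k on the radius, the normalization by k, and the default Euclidean norm are all assumptions chosen for illustration, and the paper's exact definitions may differ.

```python
import numpy as np


def rank_transform(x):
    """Rank-standardize each column to roughly unit-Pareto margins.

    Assumed form: v_ij = n / (n + 1 - rank_ij), with ranks in 1..n per column.
    """
    n = x.shape[0]
    # Double argsort gives 0-based ranks per column; add 1 for ranks in 1..n.
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
    return n / (n + 1 - ranks)


def empirical_angular_measure(x, k, in_set, norm_ord=2):
    """Estimate the angular mass of a set A on the unit sphere.

    x        : (n, d) array of observations
    k        : number of extremes used (the "effective sample size")
    in_set   : hypothetical callable mapping an (m, d) array of angles
               to a boolean array of length m (membership in A)
    norm_ord : order of the norm defining the radius and the sphere
    """
    v = rank_transform(x)
    n = v.shape[0]
    radii = np.linalg.norm(v, ord=norm_ord, axis=1)
    # Keep points whose standardized radius exceeds the assumed threshold n / k.
    extreme = radii >= n / k
    angles = v[extreme] / radii[extreme, None]
    return np.sum(in_set(angles)) / k


# Illustrative usage on simulated data with heterogeneous margins.
rng = np.random.default_rng(0)
x = np.column_stack([rng.exponential(size=1000), rng.normal(size=1000)])
mass = empirical_angular_measure(x, k=50, in_set=lambda w: w[:, 0] > w[:, 1])
```

Because only ranks enter the computation, the estimate is unchanged when any component is rescaled or transformed by an increasing function, which is the robustness property the abstract attributes to the rank transformation.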
Keywords: angular measure; classification; concentration inequality; extreme value statistics; minimum-volume sets; ranks
Date: 2021-01-01
New Economics Papers: this item is included in nep-ecm
Downloads: (external link)
https://dial.uclouvain.be/pr/boreal/fr/object/bore ... tastream/PDF_01/view (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:aiz:louvad:2021023
More papers in LIDAM Discussion Papers ISBA from Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA), Voie du Roman Pays 20, 1348 Louvain-la-Neuve (Belgium).
Bibliographic data for series maintained by Nadja Peiffer.