A multivariate extreme value theory approach to anomaly clustering and visualization

Maël Chiapino, Stephan Clémençon, Vincent Feuillard and Anne Sabourin
Additional contact information
Maël Chiapino: LTCI, Télécom Paris, Institut polytechnique de Paris
Stephan Clémençon: LTCI, Télécom Paris, Institut polytechnique de Paris
Vincent Feuillard: Airbus Central R&T, AI Research
Anne Sabourin: LTCI, Télécom Paris, Institut polytechnique de Paris

Computational Statistics, 2020, vol. 35, issue 2, No 9, 607-628

Abstract: In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $\mathbf{X}=(X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups. This paper takes this approach much further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, which implicitly defines a similarity measure between anomalies. It is explained at length how this similarity measure makes it possible to cluster extreme observations and to obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the resulting clustering and 2-d visual display are illustrated on simulated datasets and on real observations from the aeronautics application domain.
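
The step from posterior probabilities to clusters described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, for illustration only, that the similarity between two extreme points is the dot product of their posterior vectors (the probability that both share the same latent anomaly type if the types are drawn independently), and it uses connected components of a thresholded similarity graph as a stand-in for the "standard graph-mining tools". The function names, the threshold value and the posterior values are all hypothetical.

```python
from collections import deque


def similarity_matrix(posteriors):
    """Assumed similarity: s_ij = sum_k p_i[k] * p_j[k], where p_i[k] is the
    posterior probability that extreme point i has anomaly type k."""
    n = len(posteriors)
    return [
        [sum(a * b for a, b in zip(posteriors[i], posteriors[j])) for j in range(n)]
        for i in range(n)
    ]


def cluster_by_threshold(sim, threshold):
    """Label the connected components of the graph that links points i and j
    whenever sim[i][j] >= threshold (breadth-first search)."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier component
        labels[start] = current
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] >= threshold:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# Hypothetical posteriors for 4 extreme points over 3 anomaly types.
posteriors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]
sim = similarity_matrix(posteriors)
labels = cluster_by_threshold(sim, threshold=0.5)
print(labels)
```

With these made-up posteriors, the first two points are grouped together and the last two form a second group, because only those pairs put most of their posterior mass on the same anomaly type. In practice the posteriors would come from the fitted mixture model, and the threshold (or a different graph-clustering method) would have to be chosen for the data at hand.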

Keywords: Anomaly detection; Clustering; Graph-mining; Latent variable analysis; Mixture modelling; Multivariate extreme value theory; Visualization
Date: 2020

Downloads: (external link)
http://link.springer.com/10.1007/s00180-019-00913-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:35:y:2020:i:2:d:10.1007_s00180-019-00913-y

Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2

DOI: 10.1007/s00180-019-00913-y

Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik

Page updated 2025-03-20
Handle: RePEc:spr:compst:v:35:y:2020:i:2:d:10.1007_s00180-019-00913-y