CyberRisk Prediction using Machine Learning and Extreme Value Theory
Jules Sadefo Kamdem and
Danielle Selambi Kapsa
Additional contact information
Danielle Selambi Kapsa: African Institute for Mathematical Sciences (AIMS), Limbe Crystal Gardens, P.O. Box 608, Limbe, South West Region, Cameroon
Post-Print from HAL
Abstract:
This paper develops a hybrid framework for quantifying the financial impact of data breaches by combining predictive machine learning with extreme value theory (EVT). Using incident-level breach data from the Privacy Rights Clearinghouse (PRC) covering the period 2005–2020, we first estimate the number of compromised records with a Random Forest model trained on organizational, temporal, and attack-type characteristics. We then analyze the tail behavior of the predicted losses to capture the fat-tailed distribution of cyber risks. Our results indicate that the distribution of affected records is well represented by a Fréchet law, and we estimate the parameters of the Generalized Extreme Value (GEV) distribution to compute Value-at-Risk (VaR) at high confidence levels. This two-stage approach provides a rigorous assessment of maximum potential losses, addressing the question of cyber-risk insurability. By linking predictive accuracy with tail risk quantification, our findings deliver actionable insights for insurers, regulators, and organizations seeking to anticipate and manage the financial consequences of large-scale data breaches.
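As a rough illustration of the second stage described in the abstract, the sketch below evaluates the standard GEV quantile formula, VaR_p = mu + (sigma / xi) * ((-ln p)^(-xi) - 1) for xi != 0, which is how a VaR at level p is usually read off a fitted GEV distribution. It is not the authors' code: the function names `gev_cdf` and `gev_var` and the parameter values (mu = 10, sigma = 2, xi = 0.5) are hypothetical and chosen only so that xi > 0, the Fréchet case mentioned in the abstract. The abstract does not say how the parameters are estimated, so no fitting step is shown.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """CDF of the GEV distribution with location mu, scale sigma > 0, shape xi."""
    z = (x - mu) / sigma
    if xi == 0.0:
        # Gumbel limit of the GEV family
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below the lower endpoint when xi > 0
        # (Frechet case), above the upper endpoint when xi < 0 (Weibull case)
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_var(p, mu, sigma, xi):
    """Value-at-Risk at confidence level p, i.e. the p-quantile of the GEV."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if xi == 0.0:
        return mu - sigma * math.log(-math.log(p))
    return mu + (sigma / xi) * ((-math.log(p)) ** (-xi) - 1.0)


# Hypothetical parameters for illustration only (NOT estimates from the paper).
MU, SIGMA, XI = 10.0, 2.0, 0.5

for level in (0.95, 0.99, 0.995):
    print(f"VaR at {level:.1%}: {gev_var(level, MU, SIGMA, XI):.2f}")
```

With xi > 0 the quantile grows like a power of 1 / (1 - p) as p approaches 1, which is why high-confidence VaR figures for heavy-tailed breach sizes can be very large relative to typical incidents.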
Keywords: Cyber risk; Cyber insurance; Machine learning; Random Forest; Extreme Value Theory; Value-at-Risk
Date: 2025
Published in Information Systems Frontiers, In press
Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-05393158