Alleviating Class Imbalance in Actuarial Applications Using Generative Adversarial Networks

Ngwenduna, Kwanda Sydwell; Mbuvha, Rendani

Kwanda Sydwell Ngwenduna and Rendani Mbuvha
Additional contact information
Kwanda Sydwell Ngwenduna: School of Computer Science and Applied Mathematics, University of the Witwatersrand, West Campus, Mathematical Sciences Building, Private Bag 3, Wits, Braamfontein 2050, South Africa
Rendani Mbuvha: School of Statistics and Actuarial Science, University of the Witwatersrand, West Campus, Mathematical Sciences Building, Private Bag 3, Wits, Braamfontein 2050, South Africa

Risks, 2021, vol. 9, issue 3, 1-33

Abstract: To build adequate predictive models, a substantial amount of data is desirable. However, when expanding to new or unexplored territories, this required level of information is rarely available. To build such models, actuaries often have to procure data from local providers, use limited, unsuitable industry and public research, or rely on extrapolations from other better-known markets. Another common pathology when applying machine learning techniques in actuarial domains is the prevalence of imbalanced classes, where risk events of interest, such as mortality and fraud, are under-represented in the data. In this work, we show how an implicit model using the Generative Adversarial Network (GAN) can alleviate these problems through the generation of adequate quality data from very limited or highly imbalanced samples. We provide an introduction to GANs and how they are used to synthesize data that accurately enhance the data resolution of very infrequent events and improve model robustness. Overall, we show a significant superiority of GANs for boosting predictive models when compared to competing approaches on benchmark data sets. This work offers numerous contributions to actuaries, with applications to, inter alia, new sample creation, data augmentation, boosting predictive models, anomaly detection, and missing data imputation.

Keywords: actuarial science; class imbalance; data augmentation; generative models; generative adversarial network; synthetic sampling; SMOTE
JEL-codes: C G0 G1 G2 G3 K2 M2 M4
Date: 2021

Downloads:
https://www.mdpi.com/2227-9091/9/3/49/pdf (application/pdf)
https://www.mdpi.com/2227-9091/9/3/49/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jrisks:v:9:y:2021:i:3:p:49-:d:513033

Risks is currently edited by Mr. Claude Zhang