Generative Synthesis of Insurance Datasets

Kuo, Kevin

Abstract: One of the impediments to advancing actuarial research and developing open-source assets for insurance analytics is the lack of realistic, publicly available datasets. In this work, we develop a workflow for synthesizing insurance datasets using CTGAN, a recently proposed neural network architecture for generating tabular data. Applying the workflow to publicly available data in two domains, general insurance pricing and life insurance shock lapse modeling, we evaluate the synthesized datasets from three perspectives: machine learning efficacy, distributions of variables, and stability of model parameters. The workflow is implemented via an R interface to promote adoption by researchers and data owners.

Date: 2019-12, Revised 2020-08
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ias and nep-rmg

Downloads: (external link)
http://arxiv.org/pdf/1912.02423 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:1912.02423
