Generative Synthesis of Insurance Datasets
Kevin Kuo
Papers from arXiv.org
Abstract:
One of the impediments in advancing actuarial research and developing open source assets for insurance analytics is the lack of realistic publicly available datasets. In this work, we develop a workflow for synthesizing insurance datasets leveraging CTGAN, a recently proposed neural network architecture for generating tabular data. Applying the proposed workflow to publicly available data in the domains of general insurance pricing and life insurance shock lapse modeling, we evaluate the synthesized datasets from a few perspectives: machine learning efficacy, distributions of variables, and stability of model parameters. This workflow is implemented via an R interface to promote adoption by researchers and data owners.
Date: 2019-12, Revised 2020-08
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ias and nep-rmg
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://arxiv.org/pdf/1912.02423 Latest version (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:1912.02423
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().