Synthetic Dataset Generation of Driver Telematics
Banghee So,
Jean-Philippe Boucher and
Emiliano A. Valdez
Additional contact information
Banghee So: Department of Mathematics, University of Connecticut, 341 Mansfield Road, Storrs, CT 06269-1009, USA
Jean-Philippe Boucher: Département de Mathématiques, Université du Québec à Montréal, 201, Avenue du Président-Kennedy, Montréal, QC H2X 3Y7, Canada
Emiliano A. Valdez: Department of Mathematics, University of Connecticut, 341 Mansfield Road, Storrs, CT 06269-1009, USA
Risks, 2021, vol. 9, issue 4, 1-19
Abstract:
This article describes the techniques employed in the production of a synthetic dataset of driver telematics emulated from a similar real insurance dataset. The synthetic dataset contains 100,000 policies that include observations on each driver’s claims experience, together with associated classical risk variables and telematics-related variables. This work aims to produce a resource that can be used to advance models for assessing risk in usage-based insurance. The generation follows a three-stage process that uses machine learning algorithms. In the first stage, a synthetic portfolio of the space of feature variables is generated by applying an extended SMOTE algorithm. The second stage simulates values for the number of claims, treated as multiple binary classifications, using feedforward neural networks. The third stage simulates values for the aggregated amount of claims, treated as a regression problem, using feedforward neural networks, with the number of claims included in the set of feature variables. The resulting dataset is evaluated by comparing the synthetic and real datasets when Poisson and gamma regression models are fitted to the respective data. Other visualizations and data summaries produce remarkably similar statistics between the two datasets. We hope that researchers interested in obtaining telematics datasets to calibrate models or learning algorithms will find our work to be valuable.
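As a rough illustration of the first stage, the sketch below shows the core interpolation idea behind SMOTE in its plain form: each synthetic row is placed at a random point on the segment between a real row and one of its nearest neighbours. It is not the extended SMOTE algorithm used in the article, whose details are not given in this abstract. The function name `smote_like_sample` and its parameters are hypothetical, and all features are assumed to be continuous.

```python
import numpy as np

def smote_like_sample(X, n_samples, k=5, seed=0):
    """Draw synthetic rows by interpolating between a real row and one of
    its k nearest neighbours (plain SMOTE-style step; hypothetical helper,
    not the article's extended algorithm). Requires k <= len(X) - 1."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all rows.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a row is never its own neighbour

    # Indices of the k nearest neighbours of every row.
    neighbours = np.argsort(d2, axis=1)[:, :k]

    # Pick a random base row and one of its neighbours for each new sample.
    base = rng.integers(0, n, size=n_samples)
    pick = neighbours[base, rng.integers(0, k, size=n_samples)]

    # Interpolate at a uniform random position along the connecting segment.
    gap = rng.random((n_samples, 1))
    return X[base] + gap * (X[pick] - X[base])
```

Because every synthetic row is a convex combination of two real rows, the output stays inside the per-feature range of the input. Categorical or integer-valued features would need separate handling, which this sketch does not attempt.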
Keywords: Bayesian optimization; Gaussian process; neural network; SMOTE; usage-based insurance (UBI); vehicle telematics
JEL-codes: C G0 G1 G2 G3 K2 M2 M4
Date: 2021
Downloads: (external link)
https://www.mdpi.com/2227-9091/9/4/58/pdf (application/pdf)
https://www.mdpi.com/2227-9091/9/4/58/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jrisks:v:9:y:2021:i:4:p:58-:d:523212
Risks is currently edited by Mr. Claude Zhang