Combining Predictions of Auto Insurance Claims

Ye, Chenglong; Zhang, Lin; Han, Mingxuan; Yu, Yanjia; Zhao, Bingxin; Yang, Yuhong

Additional contact information
Chenglong Ye: Dr. Bing Zhang Department of Statistics, University of Kentucky, 317 Multidisciplinary Science Building, 725 Rose St., Lexington, KY 40536, USA
Lin Zhang: First American Financial, Santa Ana, CA 92707, USA
Mingxuan Han: School of Computing, University of Utah, Salt Lake City, UT 84112, USA
Yanjia Yu: School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
Bingxin Zhao: School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
Yuhong Yang: School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA

Econometrics, 2022, vol. 10, issue 2, 1-15

Abstract: This paper aims to better predict highly skewed auto insurance claims by combining candidate predictions. We analyze a version of the Kangaroo Auto Insurance company data and study the effects of combining different methods using five measures of prediction accuracy. The results show the following. First, when there is an outstanding (in terms of Gini Index) prediction among the candidates, the “forecast combination puzzle” phenomenon disappears. The simple average method performs much worse than the more sophisticated model combination methods, indicating that combining different methods could help us avoid performance degradation. Second, the choice of the prediction accuracy measure is crucial in defining the best candidate prediction for “low frequency and high severity” (LFHS) data. For example, mean square error (MSE) does not distinguish well between model combination methods, as the values are close. Third, the performances of different model combination methods can differ drastically. We propose using a new model combination method, named ARM-Tweedie, for such LFHS data; it benefits from an optimal rate of convergence and exhibits a desirable performance in several measures for the Kangaroo data. Fourth, overall, model combination methods improve the prediction accuracy for auto insurance claim costs. In particular, Adaptive Regression by Mixing (ARM), ARM-Tweedie, and constrained Linear Regression can improve forecast performance when there are only weak learners or when no dominant learner exists.
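The abstract's first finding, that a simple average can underperform when one candidate is outstanding in terms of the Gini index, can be illustrated with a minimal sketch. The definition of the normalized Gini index below is the common rank-based one and is an assumption here; the paper's exact definition may differ. The toy claims and the candidate predictions (`strong`, `weak_1`, `weak_2`) are invented for illustration and are not taken from the Kangaroo data.

```python
import numpy as np

def gini(actual, pred):
    """Unnormalized Gini: cumulative share of actual losses, ordered by
    descending prediction, measured against the diagonal."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    n = len(actual)
    # Stable sort so that tied predictions keep their original order.
    order = np.argsort(-pred, kind="stable")
    cum_share = np.cumsum(actual[order]) / actual.sum()
    return cum_share.sum() / n - (n + 1) / (2 * n)

def normalized_gini(actual, pred):
    """Gini of the prediction divided by the Gini of a perfect ranking."""
    return gini(actual, pred) / gini(actual, actual)

def simple_average(candidates):
    """Equal-weight combination of candidate prediction vectors."""
    return np.mean(np.column_stack(candidates), axis=1)

# Hypothetical "low frequency and high severity" claims: mostly zeros.
actual = np.array([0.0, 0.0, 0.0, 5.0, 0.0, 20.0])

# One candidate that ranks the claims well and two weak candidates.
strong = np.array([0.1, 0.2, 0.3, 2.0, 0.4, 9.0])
weak_1 = np.array([3.0, 1.0, 2.0, 1.5, 2.5, 0.5])
weak_2 = np.array([2.0, 3.0, 1.0, 0.5, 2.5, 1.5])

average = simple_average([strong, weak_1, weak_2])

print("strong candidate:", normalized_gini(actual, strong))
print("simple average:  ", normalized_gini(actual, average))
```

In this toy setting the equal-weight average ranks the claims worse than the strong candidate does, because the two weak candidates dilute its signal. Weighted schemes such as those named in the abstract are meant to address this, but they are not implemented here.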

Keywords: claim cost prediction; auto insurance; normalized Gini index; Tweedie distribution; model averaging
JEL-codes: B23 C C00 C01 C1 C2 C3 C4 C5 C8
Date: 2022

Downloads:
https://www.mdpi.com/2225-1146/10/2/19/pdf (application/pdf)
https://www.mdpi.com/2225-1146/10/2/19/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jecnmx:v:10:y:2022:i:2:p:19-:d:791038

Econometrics is currently edited by Ms. Jasmine Liu

Bibliographic data for series maintained by MDPI Indexing Manager.