Model-Based and Nonparametric Approaches to Clustering for Data Compression in Actuarial Applications

Adrian O’Hagan and Colm Ferrari

North American Actuarial Journal, 2017, vol. 21, issue 1, 107-146

Abstract: Clustering is used by actuaries in a data compression process to make massive or nested stochastic simulations practical to run. A large data set of assets or liabilities is partitioned into a user-defined number of clusters, each of which is compressed to a single representative policy. The representative policies can then simulate the behavior of the entire portfolio over a large range of stochastic scenarios. Such processes are becoming increasingly important in understanding product behavior and assessing reserving requirements in a big-data environment. This article proposes a variety of clustering techniques that can be used for this purpose. Initialization methods for performing clustering compression are also compared, including principal components, factor analysis, and segmentation. A variety of methods for choosing a cluster's representative policy is considered. A real data set comprising variable annuity policies, provided by Milliman, is used to test the proposed methods. It is found that the compressed data sets produced by the new methods, namely, model-based clustering, Ward's minimum variance hierarchical clustering, and k-medoids clustering, can replicate the behavior of the uncompressed (seriatim) data more accurately than those obtained by the existing Milliman method. This is verified within sample by examining location variable totals of the representative policies versus the uncompressed data at the five levels of compression of interest. More crucially, it is also verified out of sample by comparing the distributions of the present values of several variables after 20 years across 1000 simulated scenarios based on the compressed and seriatim data, using Kolmogorov-Smirnov goodness-of-fit tests and weighted sums of squared differences.
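
The compression idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the Milliman method. It assumes a simple alternating k-medoids routine, synthetic two-variable "policies" invented for the example, and a representative policy that is the cluster medoid weighted by its cluster size. All function and variable names are hypothetical. The within-sample check compares variable totals, and a plain two-sample Kolmogorov-Smirnov statistic is included for comparing simulated distributions.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance, used here as the dissimilarity (an assumption).
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, medoids):
    # Label each point with the position of its nearest medoid.
    return [min(range(len(medoids)), key=lambda j: sq_dist(p, points[medoids[j]]))
            for p in points]

def k_medoids(points, k, n_iter=50, seed=0):
    # Alternating k-medoids: assign points, then re-pick each cluster's medoid.
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(n_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # only possible with duplicate points; keep old medoid
                new_medoids.append(medoids[j])
                continue
            # The medoid is the member minimising total dissimilarity to the others.
            new_medoids.append(min(
                members,
                key=lambda c: sum(sq_dist(points[c], points[m]) for m in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

def compress(points, k, seed=0):
    # One representative policy per cluster, weighted by cluster size (assumption).
    medoids, labels = k_medoids(points, k, seed=seed)
    return [(points[m], labels.count(j)) for j, m in enumerate(medoids)]

def totals(weighted):
    # Weighted total of each variable across (policy, weight) pairs.
    dim = len(weighted[0][0])
    return [sum(w * p[v] for p, w in weighted) for v in range(dim)]

def ks_statistic(a, b):
    # Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs.
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Synthetic data: 120 hypothetical policies with two numeric variables each.
rng = random.Random(1)
policies = [(rng.gauss(c, 5.0), rng.gauss(2 * c, 5.0))
            for c in (50, 100, 200) for _ in range(40)]

compressed = compress(policies, k=6)
seriatim_totals = totals([(p, 1) for p in policies])
compressed_totals = totals(compressed)
print("seriatim totals:  ", seriatim_totals)
print("compressed totals:", compressed_totals)
```

In practice the variables would likely be standardized before clustering, and the out-of-sample check would apply `ks_statistic` to present values simulated from the compressed and seriatim data, which this sketch does not attempt.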

Date: 2017

Downloads: (external link)
http://hdl.handle.net/10.1080/10920277.2016.1234398 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:uaajxx:v:21:y:2017:i:1:p:107-146

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/uaaj20

DOI: 10.1080/10920277.2016.1234398

North American Actuarial Journal is currently edited by Kathryn Baker

More articles in North American Actuarial Journal from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst.

 
Page updated 2025-03-20
Handle: RePEc:taf:uaajxx:v:21:y:2017:i:1:p:107-146