Latent Dirichlet Allocation for structured insurance data

Charlotte Jamotton and Donatien Hainaut
Additional contact information
Charlotte Jamotton: Université catholique de Louvain, LIDAM/ISBA, Belgium
Donatien Hainaut: Université catholique de Louvain, LIDAM/ISBA, Belgium

No 2024008, LIDAM Discussion Papers ISBA from Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA)

Abstract: This article explores the application of Latent Dirichlet Allocation (LDA) to structured tabular insurance data. LDA is a probabilistic topic modelling approach initially developed in Natural Language Processing (NLP) to uncover the underlying structure of (unstructured) textual data. It was designed to represent textual documents as mixtures of latent (hidden) topics, and topics as mixtures of words. This study introduces LDA's document-topic distribution as a soft clustering tool for unsupervised learning tasks in the actuarial field. By defining each topic as a risk profile, and by treating insurance policies as documents and the modalities of categorical covariates as words, we show how LDA can be extended beyond textual data and can offer a framework to uncover underlying structures within insurance portfolios. Our experimental results and analysis highlight how the modelling of policies based on topic cluster membership, and the identification of dominant modalities within each risk profile, can give insights into the prominent risk factors contributing to higher or lower claim frequencies.
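
As a rough illustration of the mapping described in the abstract (policies as documents, modalities of categorical covariates as words, topics as risk profiles), the sketch below fits a small LDA model with a collapsed Gibbs sampler using only the Python standard library. The toy portfolio, the covariate names, the hyperparameters `alpha` and `beta`, the helper names, and the choice of a collapsed Gibbs sampler are all assumptions made for this illustration; the abstract does not state how the authors estimate the model, and this is not their implementation.

```python
import random

def policies_to_docs(policies):
    """Encode each policy as a 'document' of covariate=modality tokens."""
    return [[f"{k}={v}" for k, v in sorted(p.items())] for p in policies]

def fit_lda_gibbs(docs, n_topics, alpha=0.5, beta=0.1, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not the paper's code).

    Returns theta (policy-by-topic weights, i.e. soft cluster memberships),
    phi (topic-by-token probabilities) and the token vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w_id = {w: i for i, w in enumerate(vocab)}
    V, D, K = len(vocab), len(docs), n_topics
    ndk = [[0] * K for _ in range(D)]  # tokens of policy d assigned to topic k
    nkw = [[0] * V for _ in range(K)]  # times token v is assigned to topic k
    nk = [0] * K                       # total tokens assigned to topic k
    z = []                             # current topic of every token
    for d, doc in enumerate(docs):     # random initial assignment
        zd = []
        for w in doc:
            k = rng.randrange(K)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w_id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = w_id[w], z[d][i]
                # Remove the token, then resample its topic given all others.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][j] + alpha) * (nkw[j][v] + beta) / (nk[j] + V * beta)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    # Smoothed point estimates from the final state of the sampler.
    theta = [[(ndk[d][k] + alpha) / (len(docs[d]) + K * alpha) for k in range(K)]
             for d in range(D)]
    phi = [[(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
           for k in range(K)]
    return theta, phi, vocab

# Hypothetical toy portfolio: covariates and modalities are invented.
policies = [
    {"driver_age": "18-25", "vehicle": "sport", "area": "urban"},
    {"driver_age": "18-25", "vehicle": "sport", "area": "urban"},
    {"driver_age": "18-25", "vehicle": "compact", "area": "urban"},
    {"driver_age": "50-65", "vehicle": "sedan", "area": "rural"},
    {"driver_age": "50-65", "vehicle": "sedan", "area": "rural"},
    {"driver_age": "50-65", "vehicle": "compact", "area": "rural"},
]
docs = policies_to_docs(policies)
theta, phi, vocab = fit_lda_gibbs(docs, n_topics=2)

for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(f"risk profile {k}: dominant modalities", [w for _, w in top])
for d, row in enumerate(theta):
    print(f"policy {d}: membership", [round(p, 2) for p in row])
```

Each row of `theta` is a probability vector over risk profiles, which is what makes the clustering soft: a policy can belong partly to several profiles. Each row of `phi` ranks the modalities that dominate a profile. Relating profile membership to claim frequency, as the abstract describes, would be a separate step that this sketch does not attempt.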

Keywords: Latent Dirichlet allocation; topic modelling; soft clustering; insurance data; risk profile; natural language processing
Pages: 27
Date: 2024-03-08
New Economics Papers: this item is included in nep-big, nep-cmp and nep-rmg

Downloads: (external link)
https://dial.uclouvain.be/pr/boreal/en/object/bore ... tastream/PDF_01/view (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:aiz:louvad:2024008

More papers in LIDAM Discussion Papers ISBA from Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA) Voie du Roman Pays 20, 1348 Louvain-la-Neuve (Belgium). Contact information at EDIRC.
Bibliographic data for series maintained by Nadja Peiffer.

 
Page updated 2025-03-22
Handle: RePEc:aiz:louvad:2024008