Clustering Longitudinal Life-Course Sequences using Mixtures of Exponential-Distance Models

Keefe Murphy, Brendan Murphy, Raffaella Piccarreta and Isobel Claire Gormley
Additional contact information
Keefe Murphy: University College Dublin

No f5n8k, SocArXiv from Center for Open Science

Abstract: Sequence analysis is an increasingly popular approach for the analysis of life courses represented by an ordered collection of activities experienced by subjects over a given time period. Several criteria exist for measuring pairwise dissimilarities among sequences. Typically, dissimilarity matrices are employed as input to heuristic clustering algorithms, with the aim of identifying the most relevant patterns in the data. Here, we propose a model-based clustering approach for categorical sequence data. The technique is applied to a survey data set containing information on the career trajectories of a cohort of Northern Irish youths tracked between the ages of 16 and 22. Specifically, we develop a family of methods for clustering sequences directly, based on mixtures of exponential-distance models, which we call MEDseq. The use of the Hamming distance or weighted variants thereof as the distance metrics permits closed-form expressions for the normalising constant, thereby facilitating the development of an ECM algorithm for model fitting. Additionally, MEDseq models allow the probability of component membership to depend on fixed covariates. Sampling weights, which are often associated with life-course data arising from surveys, are also accommodated. Simultaneously including weights and covariates in the clustering process yields new insights on the Northern Irish data.
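
The closed-form normalising constant mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the simplest unweighted case: an exponential-distance density proportional to exp(-lam * d(s, theta)), where d is the Hamming distance to a central sequence theta, with a single precision parameter lam. The names lam, theta, the function names, and the state labels are illustrative assumptions, not notation or code taken from the paper. Under these assumptions the binomial theorem gives the constant (1 + (v - 1) * exp(-lam))^T for v states and sequences of length T, which the sketch checks against brute-force enumeration.

```python
from itertools import product
from math import exp, isclose


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def normalising_constant(lam, n_states, length):
    """Closed form of the sum over all sequences s of exp(-lam * hamming(s, theta)).

    Exactly C(length, k) * (n_states - 1)**k sequences lie at distance k from
    theta, so the sum collapses by the binomial theorem.
    """
    return (1.0 + (n_states - 1) * exp(-lam)) ** length


def brute_force_constant(lam, states, theta):
    """Same quantity by explicit enumeration (feasible only for tiny examples)."""
    return sum(
        exp(-lam * hamming(s, theta))
        for s in product(states, repeat=len(theta))
    )


def density(seq, theta, lam, n_states):
    """Probability of seq under the assumed exponential-distance model."""
    return exp(-lam * hamming(seq, theta)) / normalising_constant(
        lam, n_states, len(theta)
    )


# Hypothetical state labels and central sequence, for illustration only.
states = ("school", "work", "jobless")
theta = ("school", "school", "work", "work")
lam = 0.7

closed = normalising_constant(lam, len(states), len(theta))
brute = brute_force_constant(lam, states, theta)
total = sum(
    density(s, theta, lam, len(states))
    for s in product(states, repeat=len(theta))
)

assert isclose(closed, brute)
assert isclose(total, 1.0)
```

Because the constant does not require enumerating all v^T sequences, the density can be evaluated cheaply at each iteration of a fitting algorithm. Weighted Hamming variants, covariates, and sampling weights are outside the scope of this sketch.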

Date: 2019-12-05

Downloads: (external link)
https://osf.io/download/5de7c691e1e62f000a334c46/

Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:f5n8k

DOI: 10.31219/osf.io/f5n8k

Page updated 2025-03-19
Handle: RePEc:osf:socarx:f5n8k