Prediction of Attrition in Large Longitudinal Studies: Tree-based methods versus Multinomial Logistic Models

Katherine Laura Best, Lydia Gabriela Speyer, Aja Louise Murray and Anastasia Ushakova

No tyszr, SocArXiv from Center for Open Science

Abstract: Identifying predictors of attrition is essential for designing longitudinal studies so that attrition bias can be minimised, and for identifying variables that can serve as auxiliary variables in statistical techniques that help correct for non-random drop-out. This paper provides a comparative overview of predictive techniques that can be used to model attrition and to identify important risk factors that help in its prediction. Logistic regression and several tree-based machine learning methods were applied to Wave 2 dropout in an illustrative sample of 5000 individuals from a large UK longitudinal study, Understanding Society. Each method was evaluated on accuracy, AUC-ROC, plausibility of key assumptions and interpretability. Our results suggest a 10% improvement in accuracy for random forest compared to logistic regression methods. However, given the differences in estimation procedures, we suggest that both models could be used in conjunction to provide the most comprehensive understanding of attrition predictors.
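As a minimal sketch of the two numeric criteria named in the abstract, the Python below computes accuracy and AUC-ROC from predicted dropout probabilities. It is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical and are not taken from Understanding Society or from the paper's results.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of cases whose thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def auc_roc(y_true, y_prob):
    """AUC-ROC as the probability that a randomly chosen dropout case is
    scored above a randomly chosen retained case (ties count as half)."""
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = dropped out at Wave 2, 0 = retained.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted dropout probabilities from two competing models.
model_a = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.5]
model_b = [0.8, 0.1, 0.7, 0.3, 0.2, 0.9, 0.4, 0.6]

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    print(name, accuracy(y_true, scores), round(auc_roc(y_true, scores), 3))
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is one reason a comparison of models may report both.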

Date: 2021-03-02
New Economics Papers: this item is included in nep-big, nep-cmp and nep-ecm

Downloads: (external link)
https://osf.io/download/603d4b59035cf702bfc831d3/

Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:tyszr

DOI: 10.31219/osf.io/tyszr

Handle: RePEc:osf:socarx:tyszr