Investigating the impact of undersampling and bagging: an empirical investigation for customer attrition modeling

de Caigny, Arno; Coussement, Kristof; Meire, Matthijs; Hoornaert, Steven

Additional contact information
Arno de Caigny: LEM - Lille économie management - UMR 9221 - UA - Université d'Artois - UCL - Université catholique de Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique
Kristof Coussement: LEM - Lille économie management - UMR 9221 - UA - Université d'Artois - UCL - Université catholique de Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique
Matthijs Meire: LEM - Lille économie management - UMR 9221 - UA - Université d'Artois - UCL - Université catholique de Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique
Steven Hoornaert: LEM - Lille économie management - UMR 9221 - UA - Université d'Artois - UCL - Université catholique de Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique

Post-Print from HAL

Abstract: Given the growing interest in using AI and analytics to support CRM decision making, we discuss why undersampling and bagging are popular techniques in customer churn prediction (CCP). The former helps tackle the class imbalance problem, and the latter improves model stability. However, the extant CCP literature is unclear on the impact of undersampling on model stability and predictive performance, while bagging has difficulty handling class imbalance. Therefore, we extend existing CCP research by benchmarking underbagging, which combines undersampling and bagging. Combining the two techniques recovers customer data that would have been lost in undersampling by using them across multiple bags, while still passing an undersampled, more balanced training set to the classifier. In an extensive experiment on 11 real-life CCP datasets, underbagging is benchmarked against its constituents and other popular CCP classifiers in terms of predictive performance, profit, and operational efficiency. Our results indicate that underbagging is a valid and reliable alternative framework for CCP.
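
The sampling idea behind underbagging can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `underbag_indices`, the label coding (1 = churner, 0 = non-churner), and the choice to keep every minority-class customer in each bag are assumptions made for illustration, and the base classifier is omitted.

```python
import random


def underbag_indices(labels, n_bags, seed=0):
    """Return one balanced list of row indices per bag.

    Assumption: each bag keeps every minority-class (label 1) index and
    draws an equally sized random subset of majority-class (label 0)
    indices without replacement. Other underbagging variants exist.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    if len(minority) > len(majority):
        raise ValueError("expected label 1 to be the minority class")
    bags = []
    for _ in range(n_bags):
        # A fresh undersample of the majority class for every bag.
        sampled_majority = rng.sample(majority, len(minority))
        bags.append(minority + sampled_majority)
    return bags


# Toy data: 5 churners among 100 customers (hypothetical, not from the paper).
labels = [1] * 5 + [0] * 95
bags = underbag_indices(labels, n_bags=10)

# Majority-class customers used by a single undersample versus all bags.
single_bag_majority = {i for i in bags[0] if labels[i] == 0}
used_majority = {i for bag in bags for i in bag if labels[i] == 0}
print(len(single_bag_majority), "majority customers in one undersample")
print(len(used_majority), "majority customers used across all bags")
```

In a full pipeline, one base classifier would be trained on each bag and the predictions combined, for example by averaging predicted churn probabilities. Every bag stays balanced, while the bags together draw on more of the majority class than a single undersample does.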

Date: 2025-02-11

Published in Annals of Operations Research, 2025, 346 (3), pp.2401-2421. ⟨10.1007/s10479-025-06516-9⟩

Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-05114891

DOI: 10.1007/s10479-025-06516-9
