Prediction of customer churn risk with advanced machine learning methods
Oguzhan Akan,
Abhishek Verma and
Sonika Sharma
International Journal of Data Science, 2025, vol. 10, issue 1, 70-95
Abstract:
Customer churn risk prediction is an important area of research because it directly affects the revenue stream of businesses. The ability to predict customer churn allows businesses to devise better strategies for retaining existing customers. In this research, we perform a comprehensive comparison of feature selection methods, upsampling methods, and machine learning methods on the customer churn risk dataset: i) we compare likelihood-based, tree-based, and layer-based machine learning methods on the churn dataset; ii) models built on the churn dataset without upsampling performed better than models built with oversampling, although the synthetic minority oversampling technique (SMOTE) and adaptive synthetic sampling (ADASYN) helped stabilise model performance; iii) models built on the ADASYN dataset were slightly better than their SMOTE counterparts; iv) XGBoost and deep cascading forest (DCF) combined with XGBoost were consistently better than the other methods across all metrics; and v) information value analysis performed better than PCA. In particular, the IVR DCFX model achieved the best AUROC score, 89.1%.
Keywords: customer churn; DNNs; deep neural networks; DCF; deep cascading forest; SMOTE; synthetic minority oversampling technique; ADASYN; adaptive synthetic sampling.
Date: 2025
Downloads: (external link)
http://www.inderscience.com/link.php?id=144832 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdsci:v:10:y:2025:i:1:p:70-95
More articles in International Journal of Data Science from Inderscience Enterprises Ltd