Predicting credit card customer churn in banks using data mining
Dudyala Anil Kumar and V. Ravi
International Journal of Data Analysis Techniques and Strategies, 2008, vol. 1, issue 1, 4-28
Abstract:
In this paper, we address credit card customer churn prediction using data mining. We developed an ensemble system based on majority voting, with Multilayer Perceptron (MLP), Logistic Regression (LR), decision tree (J48), Random Forest (RF), Radial Basis Function (RBF) network and Support Vector Machine (SVM) as its constituents. The dataset was taken from the Business Intelligence Cup organised by the University of Chile in 2004. Because the dataset is highly unbalanced, with 93% loyal and 7% churned customers, we balanced it using (1) undersampling, (2) oversampling, (3) a combination of undersampling and oversampling and (4) the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, tenfold cross-validation was employed. The results indicated that SMOTE achieved good overall accuracy. In addition, SMOTE and the combination of undersampling and oversampling improved the sensitivity and overall accuracy of majority voting. The Classification and Regression Tree (CART) was used for feature selection, and the reduced feature set was fed to the classifiers mentioned above. Thus, this paper outlines the most important predictor variables for the credit card churn prediction problem. Moreover, the rules generated by the J48 decision tree act as an early warning expert system.
Keywords: credit cards; credit card churn; data mining; churn prediction; multilayer perceptron; MLP; logistic regression; decision tree; random forest; radial basis function; RBF neural networks; support vector machine; SVM; synthetic minority oversampling technique; SMOTE; undersampling; oversampling; expert systems.
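The balancing and voting steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`random_undersample`, `smote_like`, `majority_vote`), the Euclidean neighbour search, and the tie-breaking rule are all assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def random_undersample(rows, labels, majority_label, keep, rng):
    """Keep every minority row plus a random sample of `keep` majority rows."""
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    minority_idx = [i for i, y in enumerate(labels) if y != majority_label]
    kept = rng.sample(majority_idx, min(keep, len(majority_idx)))
    idx = sorted(kept + minority_idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def smote_like(minority_rows, n_new, k, rng):
    """Create n_new synthetic minority rows in the spirit of SMOTE.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours
    (squared Euclidean distance; an assumption of this sketch).
    """
    if len(minority_rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        others = [r for r in minority_rows if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier. Ties go to the
    label encountered first, which is an arbitrary choice of this sketch.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]
```

With an even number of constituent classifiers, as in the six-member ensemble described above, ties between two labels are possible, so a real system would need an explicit tie-breaking rule; the one used here is only a placeholder.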
Date: 2008
Citations: 15 (EconPapers)
Downloads: (external link)
http://www.inderscience.com/link.php?id=20020 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:1:y:2008:i:1:p:4-28
More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd