Assessment of the Influence of Dependent Variable Distribution on Selected Goodness of Fit Measures Using the Example of Customer Churn Model
Migut Grzegorz
Additional contact information
Migut Grzegorz: StatSoft Polska sp. z o.o.
Econometrics. Advances in Applied Data Analysis, 2020, vol. 24, issue 1, 51-70
Abstract:
Classification models support the choice of optimal actions at every stage of the customer lifecycle. One circumstance that affects both the model-building process and the assessment of a model’s discriminatory power is an unbalanced distribution of the dichotomous dependent variable. The article focuses on the reliable assessment of goodness of fit. Its first part reviews measures of predictive power; it then assesses the impact of the distribution of the dependent variable on selected goodness-of-fit measures. A number of measures, such as lift, accuracy (ACC), and F-score, proved highly sensitive to that distribution. MCC and Cohen’s kappa were also sensitive to it. Sensitivity (SENS), specificity (SPEC), Youden’s index, and measures based on ROC curves showed no such sensitivity. These conclusions may help avoid misjudging the predictive power of models built for both learning and business practice.
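The effect described in the abstract can be illustrated with a minimal sketch. It is not the article's own procedure or data: the function name `metrics`, the chosen sensitivity (0.8), specificity (0.9), and the two shares of positive cases (50% and 5%) are all assumptions made for illustration. Lift is computed here as precision divided by the share of positives, which is only one of several definitions in use. The sketch holds sensitivity and specificity fixed and changes only the distribution of the dependent variable.

```python
import math


def metrics(sens, spec, prevalence, n=100_000):
    """Confusion-matrix measures for a hypothetical classifier with fixed
    sensitivity and specificity, evaluated on n cases of which a share
    `prevalence` are positive (illustrative assumption, not article data)."""
    pos = n * prevalence
    neg = n - pos
    tp = sens * pos          # true positives
    fn = pos - tp            # false negatives
    tn = spec * neg          # true negatives
    fp = neg - tn            # false positives

    acc = (tp + tn) / n
    precision = tp / (tp + fp)
    f1 = 2 * precision * sens / (precision + sens)
    lift = precision / prevalence  # assumed definition: precision vs. base rate
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (acc - p_e) / (1 - p_e)
    youden = sens + spec - 1

    return {
        "ACC": acc, "F1": f1, "lift": lift,
        "MCC": mcc, "kappa": kappa, "Youden": youden,
    }


# Same classifier quality, two different class distributions.
for share in (0.5, 0.05):
    row = metrics(sens=0.8, spec=0.9, prevalence=share)
    print(share, {name: round(value, 3) for name, value in row.items()})
```

Under these assumptions, accuracy, F-score, lift, MCC, and kappa differ between the two rows even though the classifier is unchanged, while Youden's index, which depends only on sensitivity and specificity, stays the same.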
Keywords: classification models; goodness of fit; unbalanced datasets; customer churn analysis
JEL-codes: C10 C52
Date: 2020
Download: https://doi.org/10.15611/eada.2020.1.05
Persistent link: https://EconPapers.repec.org/RePEc:vrs:eaiada:v:24:y:2020:i:1:p:51-70:n:5
DOI: 10.15611/eada.2020.1.05
Econometrics. Advances in Applied Data Analysis is currently edited by Józef Dziechciarz
More articles in Econometrics. Advances in Applied Data Analysis from Sciendo
Bibliographic data for series maintained by Peter Golla.