Enhanced financial fraud detection using cost‐sensitive cascade forest with missing value imputation
Lukui Huang,
Alan Abrahams and
Peter Ractham
Intelligent Systems in Accounting, Finance and Management, 2022, vol. 29, issue 3, 133-155
Abstract:
Financial statement fraud is a global problem for investors, audit firms, regulators, and other stakeholders. Fraud detection can be regarded as a binary classification problem in which a false negative is more expensive than a false positive. Although existing studies have made great efforts to detect fraud using various data‐mining techniques, the difference in misclassification costs is seldom considered. In this study, we propose a cost‐sensitive cascade forest (CSCF) for fraud detection, which places a heavy penalty on false negative predictions and self‐adjusts the depth of the cascade forest according to the classifier’s recall (i.e., the classifier’s sensitivity). As missing values are ubiquitous in fraud research, we also explore the effect of selected missing data treatments on prediction performance: complete case analysis, three classic statistical approaches (zero, mean, and modified mean imputation), and two machine learning approaches (K‐nearest neighbor [KNN] and random forest [RF]). The experimental results show that the proposed CSCF significantly improves fraud prediction compared with one of the latest fraud detection models, which uses the RUSBoost algorithm. Comparing different missing value treatments, even though RUSBoost and CSCF perform well when using complete case analysis, we find that the best performance is achieved when CSCF is used with missing data imputed as zero. This treatment further improves performance and results in an area under the curve (AUC) score of 0.82, compared to the highest AUC (0.71) from the baseline model. Supplementary analysis further reveals that the low AUC of complete case analysis for the two examined models persists under different training sizes. Thus, our findings shed light on the potential benefits of missing value imputation for model performance in fraud detection.
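The two ideas in the abstract, simple missing value treatments and unequal misclassification costs, can be illustrated with a minimal sketch. This is not the authors' CSCF and does not reproduce their results: the function names, the toy rows, and the cost values (`cost_fp=1.0`, `cost_fn=10.0`) are all hypothetical, and the decision rule shown is the standard minimum-expected-cost threshold applied to predicted probabilities, swapped in here for the cascade forest.

```python
from statistics import mean


def complete_cases(rows):
    """Complete case analysis: keep only rows with no missing (None) values."""
    return [r for r in rows if all(v is not None for v in r)]


def impute(rows, strategy="zero"):
    """Fill None entries column by column with 0.0 or the observed column mean."""
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if strategy == "zero":
            fills.append(0.0)
        elif strategy == "mean":
            # Fall back to 0.0 if a column has no observed values at all.
            fills.append(mean(observed) if observed else 0.0)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return [[fills[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability above which flagging fraud has the lower expected cost.

    Flag when p * cost_fn >= (1 - p) * cost_fp, i.e. p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict(probs, cost_fp=1.0, cost_fn=10.0):
    """Label each predicted fraud probability as 1 (fraud) or 0 (non-fraud)."""
    t = cost_sensitive_threshold(cost_fp, cost_fn)
    return [1 if p >= t else 0 for p in probs]


def recall(y_true, y_pred):
    """Share of true fraud cases that were flagged (sensitivity)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0
```

With a false negative assumed to cost ten times a false positive, the threshold drops from 0.5 to 1/11, so more borderline cases are flagged and recall can only stay the same or rise, at the price of more false positives.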
Date: 2022
Downloads: (external link)
https://doi.org/10.1002/isaf.1517
Persistent link: https://EconPapers.repec.org/RePEc:wly:isacfm:v:29:y:2022:i:3:p:133-155
Ordering information: This journal article can be ordered from
http://www.blackwell ... bs.asp?ref=1099-1174
More articles in Intelligent Systems in Accounting, Finance and Management from John Wiley & Sons, Ltd.