Symbolic data analysis as a tool for credit fraud detection

Dudek, Andrzej; Pełka, Marcin

Andrzej Dudek and Marcin Pełka
Andrzej Dudek: Uniwersytet Ekonomiczny we Wrocławiu
Marcin Pełka: Uniwersytet Ekonomiczny we Wrocławiu

Bank i Kredyt, 2022, vol. 53, issue 6, 587-604

Abstract: Monetary fraud is arguably as old as money itself. New technologies give criminals new ways to commit fraud, and they also provide new methods of preventing it. Fraud detection is the process of identifying whether a newly authorised transaction is fraudulent or genuine (Maes et al. 2002). Many classical methods can be used to detect fraud. This paper proposes applying symbolic data analysis methods to the credit card fraud detection problem. The main hypothesis is that the decision tree for symbolic data is a better tool for credit card fraud detection than other methods. Unlike classical data analysis, symbolic data analysis allows objects to be described in a more precise and complex way. It makes it possible to take into account all variability and uncertainty in the data and provides suitable methods and techniques for dealing with such data (see: Bock, Diday 2000; Billard, Diday 2006). The first part is an introduction that describes the problem of credit card fraud detection and reviews the literature on it. The second part presents the basic ideas of symbolic data analysis and describes the models applied in the empirical part: the decision tree, logistic regression, the k-nearest neighbour method and kernel discriminant analysis, each for symbolic data. The third part presents the results of credit card fraud detection. All models are built on a data set of 284,807 card transactions, 492 of which are fraudulent. The results show that, for a single model, decision trees usually lead to slightly better results than other methods in the symbolic data case. The last part presents final remarks.
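
As a rough illustration of the ideas in the abstract, the Python sketch below aggregates a few classical transaction records into interval-valued symbolic objects and classifies a new object with a nearest-neighbour rule. It is not the authors' implementation. The variables (amount, hour), the toy values, the Hausdorff-type interval distance and all function names are assumptions made for this example; only the two data set counts come from the abstract.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
SymbolicObject = List[Interval]


def to_intervals(rows: Sequence[Sequence[float]]) -> SymbolicObject:
    """Aggregate classical records into one symbolic object:
    each variable is described by its [min, max] interval."""
    return [(min(col), max(col)) for col in zip(*rows)]


def hausdorff(a: Interval, b: Interval) -> float:
    """Hausdorff distance between two closed intervals."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def distance(x: SymbolicObject, y: SymbolicObject) -> float:
    """Sum of per-variable interval distances (an assumed choice)."""
    return sum(hausdorff(a, b) for a, b in zip(x, y))


def knn_predict(train: Sequence[Tuple[SymbolicObject, str]],
                query: SymbolicObject, k: int = 1) -> str:
    """Majority label among the k nearest symbolic objects."""
    ranked = sorted(train, key=lambda item: distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (amount, hour) records, grouped into symbolic objects.
genuine = to_intervals([(12.0, 9.0), (30.0, 14.0), (25.0, 18.0)])
fraud = to_intervals([(900.0, 2.0), (1500.0, 3.0)])
query = to_intervals([(1000.0, 1.0), (1200.0, 4.0)])

train = [(genuine, "genuine"), (fraud, "fraud")]
prediction = knn_predict(train, query, k=1)

# Class imbalance implied by the counts given in the abstract.
fraud_share = 492 / 284_807

print(prediction)
print(f"{fraud_share:.4%}")
```

The fraud share computed from the reported counts is below 0.2% of all transactions, so the two classes are strongly imbalanced.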

Keywords: credit card; fraud detection; symbolic data; machine learning; R software
JEL-codes: C02 C19 C38 G2
Date: 2022

Downloads: (external link)
https://bankikredyt.nbp.pl/content/2022/06/bik_06_2022_02.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:nbp:nbpbik:v:53:y:2022:i:6:p:587-604
