High-Cardinality Categorical Attributes and Credit Card Fraud Detection
Emanuel Mineda Carneiro,
Carlos Henrique Quartucci Forster,
Lineu Fernando Stege Mialaret,
Luiz Alberto Vieira Dias and
Adilson Marques da Cunha
Additional contact information
Emanuel Mineda Carneiro: Sao Paulo State Technological College (Faculdade de Tecnologia—Fatec), Sao Jose dos Campos 12247-014, Brazil
Carlos Henrique Quartucci Forster: Brazilian Aeronautics Institute of Technology (Instituto Tecnologico de Aeronautica—ITA), Sao Jose dos Campos 12228-900, Brazil
Lineu Fernando Stege Mialaret: Federal Institute of Education, Science and Technology of Sao Paulo (Instituto Federal de Sao Paulo—IFSP), Jacarei 12322-030, Brazil
Luiz Alberto Vieira Dias: Brazilian Aeronautics Institute of Technology (Instituto Tecnologico de Aeronautica—ITA), Sao Jose dos Campos 12228-900, Brazil
Adilson Marques da Cunha: Brazilian Aeronautics Institute of Technology (Instituto Tecnologico de Aeronautica—ITA), Sao Jose dos Campos 12228-900, Brazil
Mathematics, 2022, vol. 10, issue 20, 1-23
Abstract:
Credit card transactions may contain categorical attributes with large domains, involving up to hundreds of possible values, known as high-cardinality attributes. Including such attributes makes analysis harder, because results generalize more poorly and resource usage is higher. A common practice is therefore to remove such attributes, at the cost of wasting the information they provide. In contrast, this paper reports our findings on the positive impacts of using high-cardinality attributes in credit card fraud detection. We also present a new algorithm for domain reduction that preserves fraud-detection capabilities. Experiments applying a deep feedforward neural network to real datasets from a major Brazilian financial institution have shown that, when measured by the F1 metric, the inclusion of such attributes does improve fraud-detection quality. As its main contribution, the proposed algorithm was able to reduce attribute cardinality, improving the training times of a model while preserving its predictive capabilities.
Keywords: credit card fraud; fraud-detection system; high-cardinality attribute; pattern recognition; clustering; deep learning
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/10/20/3808/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/20/3808/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:20:p:3808-:d:943211
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.