Redundancy Is Not Necessarily Detrimental in Classification Problems
Sebastián Alberto Grillo,
José Luis Vázquez Noguera,
Julio César Mello Román,
Miguel García-Torres,
Jacques Facon,
Diego P. Pinto-Roa,
Luis Salgueiro Romero,
Francisco Gómez-Vela,
Laura Raquel Bareiro Paniagua and
Deysi Natalia Leguizamon Correa
Additional contact information
Sebastián Alberto Grillo: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
José Luis Vázquez Noguera: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
Julio César Mello Román: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
Miguel García-Torres: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
Jacques Facon: Department of Computer and Electronics, Universidade Federal do Espírito Santo, São Mateus 29932-540, Brazil
Diego P. Pinto-Roa: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
Luis Salgueiro Romero: Signal Theory and Communications Department, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
Francisco Gómez-Vela: Data Science and Big Data Lab, Universidad Pablo de Olavide, 41013 Seville, Spain
Laura Raquel Bareiro Paniagua: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
Deysi Natalia Leguizamon Correa: Computer Engineering Department, Universidad Americana, Asunción 1206, Paraguay
Mathematics, 2021, vol. 9, issue 22, 1-22
Abstract:
In feature selection, redundancy is a major concern because removing redundant data is connected with dimensionality reduction. Despite the evidence of such a connection, few works present theoretical studies of redundancy. In this work, we analyze the effect of redundant features on the performance of classification models. The contributions of this work can be summarized as follows: (i) we develop a theoretical framework to analyze feature construction and selection, (ii) we show that certain properly defined features are redundant yet make the data linearly separable, and (iii) we propose a formal criterion to validate feature construction methods. The experimental results suggest that a large number of redundant features can reduce the classification error. This implies that it is not enough to analyze features solely with criteria that measure the amount of information they provide.
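To make the central claim concrete, the following is a minimal illustrative sketch, not taken from the paper, assuming Python with NumPy and scikit-learn: a constructed feature that is a deterministic function of the existing features carries no new information, yet appending it can turn a non-linearly-separable problem (here, XOR) into a linearly separable one and lower the error of a linear classifier.

```python
# Illustrative sketch (not the authors' construction): a feature that is a
# deterministic function of existing features is redundant in the
# information-theoretic sense, yet it can make XOR linearly separable.
import numpy as np
from sklearn.linear_model import LogisticRegression

# XOR data: no linear classifier in (x1, x2) classifies all four points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

linear_only = LogisticRegression().fit(X, y)
print("accuracy with original features:", linear_only.score(X, y))  # below 1.0

# The constructed feature x1 * x2 adds no information (it is fully determined
# by x1 and x2), but appending it makes the classes linearly separable.
X_aug = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])
augmented = LogisticRegression(C=1e6).fit(X_aug, y)
print("accuracy with redundant feature added:", augmented.score(X_aug, y))  # 1.0
```

In this sketch the separating hyperplane x1 + x2 - 2*x1*x2 > 0.5 exists only after the redundant product feature is appended, which is the kind of effect the abstract describes.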
Keywords: feature selection; feature construction; classification
JEL-codes: C
Date: 2021
Downloads:
https://www.mdpi.com/2227-7390/9/22/2899/pdf (application/pdf)
https://www.mdpi.com/2227-7390/9/22/2899/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:9:y:2021:i:22:p:2899-:d:679141