Customized Instance Random Undersampling to Increase Knowledge Management for Multiclass Imbalanced Data Classification
Claudia C. Tusell-Rey,
Oscar Camacho-Nieto,
Cornelio Yáñez-Márquez and
Yenny Villuendas-Rey
Additional contact information
Claudia C. Tusell-Rey: Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz s/n, GAM, Ciudad de México 07700, Mexico
Oscar Camacho-Nieto: Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo, Av. Juan de Dios Bátiz s/n, GAM, Ciudad de México 07700, Mexico
Cornelio Yáñez-Márquez: Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz s/n, GAM, Ciudad de México 07700, Mexico
Yenny Villuendas-Rey: Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo, Av. Juan de Dios Bátiz s/n, GAM, Ciudad de México 07700, Mexico
Sustainability, 2022, vol. 14, issue 21, 1-16
Abstract:
Imbalanced data is a challenge for knowledge management. The problem is even more complex for hybrid data (numeric and categorical) with missing values and multiple decision classes. Unfortunately, health-related information is often multiclass, hybrid, and imbalanced. This paper introduces a novel undersampling procedure that deals with multiclass hybrid data. We explore its impact on the performance of the recently proposed customized naïve associative classifier (CNAC). The experiments and the statistical analysis show that the proposed method surpasses existing classifiers, with the advantage of handling multiclass, hybrid, and incomplete data at a low computational cost. In addition, our experiments show that the CNAC benefits from data sampling; therefore, we recommend using the proposed undersampling procedure to balance data for CNAC.
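For readers unfamiliar with the general idea, the following is a minimal sketch of plain random undersampling on multiclass data. It is not the customized procedure proposed in the paper, whose details are not given in this record. The function name `random_undersample`, the toy rows, and the choice to shrink every class to the size of the smallest one are all assumptions made for illustration. Rows mix numeric and categorical values, and `None` stands in for a missing value, which this approach never needs to inspect.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: keep a random subset of each class so that
    every class ends up with as many instances as the smallest class."""
    rng = random.Random(seed)

    # Group instance indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Assumed target size: the size of the smallest class.
    target = min(len(indices) for indices in by_class.values())

    # Sample `target` indices per class without replacement.
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, target))
    keep.sort()  # preserve the original instance order

    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy hybrid data (invented for this sketch): (age, sex, blood_type).
rows = [
    (63, "male", "A"), (41, "female", None), (57, "male", "O"),
    (38, "female", "B"), (None, "male", "AB"), (49, "female", "O"),
    (72, "male", None), (55, "female", "A"), (60, "male", "B"),
    (35, "female", "O"), (44, "male", "A"),
]
labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(labels))           # class sizes before: 6, 3, 2
print(Counter(balanced_labels))  # every class now has 2 instances
```

Because the sampling only uses class labels and row indices, the attribute types and missing values play no role here; a customized method such as the one the paper proposes would presumably choose which instances to drop more carefully.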
Keywords: imbalanced data classification; knowledge management; undersampling; decision-making; artificial intelligence
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2022
Downloads: (external link)
https://www.mdpi.com/2071-1050/14/21/14398/pdf (application/pdf)
https://www.mdpi.com/2071-1050/14/21/14398/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:14:y:2022:i:21:p:14398-:d:962086
Sustainability is currently edited by Ms. Alexandra Wu
Bibliographic data for series maintained by MDPI Indexing Manager.