Statistical Entropy Measures in C4.5 Trees
Aldo Ramirez Arellano,
Juan Bory-Reyes and
Luis Manuel Hernandez-Simon
Additional contact information
Aldo Ramirez Arellano: Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Mexico City, Mexico
Juan Bory-Reyes: Escuela Superior de Ingeniería Mecánica y Eléctrica Zacatenco, Instituto Politécnico Nacional, Mexico City, Mexico
Luis Manuel Hernandez-Simon: Escuela Superior de Ingeniería Mecánica y Eléctrica Zacatenco, Instituto Politécnico Nacional, Mexico City, Mexico
International Journal of Data Warehousing and Mining (IJDWM), 2018, vol. 14, issue 1, 1-14
Abstract:
This article presents a statistical study of decision tree learning algorithms based on different parametric entropy measures. Partial empirical evidence supports the conjecture that adjusting the parameter of an entropy measure might bias the classification. Receiver operating characteristic (ROC) curve analysis, specifically the area under the ROC curve (AURC), provides the best criterion for evaluating decision trees based on parametric entropies. The authors emphasize that the improvement in the AURC depends on the type of each dataset. The results support the hypothesis that parametric algorithms are useful for datasets with numeric or nominal attributes, but not for datasets with mixed attributes; thus, four hybrid approaches are proposed. The hybrid algorithm based on Renyi entropy is suitable for nominal, numeric, and mixed datasets. Moreover, it requires less time when the number of nodes is reduced while the AURC is maintained or increased, so it is preferable for large datasets.
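As a rough illustration of what a parametric entropy looks like as a split criterion, the sketch below computes the Renyi entropy of order alpha for a list of class labels, and the resulting information gain of a candidate split. The function names `renyi_entropy` and `renyi_gain` are hypothetical, and the abstract does not state how the authors integrate the measure into C4.5 (for example, whether they apply it through the gain ratio), so this is only a minimal sketch under those assumptions.

```python
import math
from collections import Counter


def renyi_entropy(labels, alpha):
    """Renyi entropy (base 2) of the class distribution in a non-empty label list.

    Assumes alpha > 0. As alpha approaches 1 the Renyi entropy converges to the
    Shannon entropy, so that case is handled explicitly.
    """
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    if math.isclose(alpha, 1.0):
        # Shannon limit: -sum(p * log2(p))
        return -sum(p * math.log2(p) for p in probs)
    # General case: log2(sum(p^alpha)) / (1 - alpha)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def renyi_gain(labels, partition, alpha):
    """Entropy reduction obtained by splitting `labels` into the subsets in `partition`.

    Hypothetical helper: plain information gain with Renyi entropy swapped in,
    not the gain ratio that standard C4.5 uses.
    """
    n = len(labels)
    remainder = sum(len(subset) / n * renyi_entropy(subset, alpha)
                    for subset in partition)
    return renyi_entropy(labels, alpha) - remainder


if __name__ == "__main__":
    labels = ["yes", "yes", "no", "no"]
    split = [["yes", "yes"], ["no", "no"]]
    print(renyi_entropy(labels, alpha=2.0))
    print(renyi_gain(labels, split, alpha=2.0))
```

Varying `alpha` changes how strongly the dominant classes weigh in the impurity score, which is the parameter adjustment the abstract refers to.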
Date: 2018
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJDWM.2018010101 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:14:y:2018:i:1:p:1-14
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede
More articles in International Journal of Data Warehousing and Mining (IJDWM) from IGI Global