Predicting the Performance of Ensemble Classification Using Conditional Joint Probability

Murtza, Iqbal; Kim, Jin-Young; Adnan, Muhammad

Iqbal Murtza: Education and Research Center for IoT Convergence Intelligent City Safety Platform, Chonnam National University, Gwangju 61186, Republic of Korea
Jin-Young Kim: Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea
Muhammad Adnan: Department of Technology and Safety, UiT the Arctic University of Norway, 9019 Tromsø, Norway

Mathematics, 2024, vol. 12, issue 16, 1-16

Abstract: In many machine learning applications, a single classifier does not achieve satisfactory performance. In such cases, an ensemble classifier is constructed from several weak base learners. Unfortunately, ensemble construction is empirical: an ensemble is built and tried, and if its performance is not satisfactory, it is discarded. This paper considers the challenging analytical problem of estimating the performance of ensemble classification from the prediction performance of the base learners. The proposed formulation aims to estimate the performance of an ensemble without actually building it, and it is derived from probability theory by manipulating the decision probabilities of the base learners. For this purpose, the output of a base learner (a true positive, true negative, false positive, or false negative) is treated as a random variable. The effects of logical disjunction-based and majority voting-based decision combination strategies are then analyzed from the perspective of conditional joint probability. The forecasted ensemble performance is evaluated on publicly available standard datasets. The results show that the derived formulations are effective at estimating the performance of ensemble classification. In addition, the theoretical and experimental results show that the logical disjunction-based decision outperforms majority voting on imbalanced datasets and in cost-sensitive scenarios.
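
To illustrate the kind of estimate the abstract describes, the following is a minimal sketch, not the paper's formulation. It assumes the base learners' decisions are conditionally independent given the true class, which is a simplification introduced here; the paper works with conditional joint probabilities. The function names `or_rate` and `majority_rate` and the per-learner rates are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import prod


def or_rate(rates):
    """Probability that at least one base learner outputs 'positive'.

    Under logical disjunction the ensemble says 'positive' if any learner
    does. Assumes independence given the true class (our simplification).
    """
    return 1.0 - prod(1.0 - r for r in rates)


def majority_rate(rates):
    """Probability that a strict majority of base learners output 'positive'.

    Enumerates every vote pattern, so it is only practical for a small
    number of learners. Same independence assumption as or_rate.
    """
    n = len(rates)
    total = 0.0
    for votes in product((0, 1), repeat=n):
        if 2 * sum(votes) > n:
            total += prod(r if v else 1.0 - r for r, v in zip(rates, votes))
    return total


# Illustrative per-learner rates (made up for this sketch).
tpr = [0.7, 0.6, 0.8]  # P(predict positive | actual positive)
fpr = [0.1, 0.2, 0.1]  # P(predict positive | actual negative)

print("OR       TPR, FPR:", or_rate(tpr), or_rate(fpr))
print("Majority TPR, FPR:", majority_rate(tpr), majority_rate(fpr))
```

With these illustrative rates, disjunction yields a higher true positive rate than majority voting at the cost of a higher false positive rate. That is the trade-off which matters when missed positives are costly, as in imbalanced or cost-sensitive settings.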

Keywords: machine learning; probability theory; ensemble classification; cost-sensitive learning; binary classification
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/16/2586/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/16/2586/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:16:p:2586-:d:1461048

Mathematics is currently edited by Ms. Emma He