A fused grey wolf and artificial bee colony model for imbalanced data classification problems
Kusum Kumari Bharti,
Ashutosh Tripathi and
Mohona Ghosh
Additional contact information
Kusum Kumari Bharti: Dr. B. R. Ambedkar National Institute of Technology
Ashutosh Tripathi: Pandit Deendayal Energy University
Mohona Ghosh: Indira Gandhi Delhi Technical University For Women
International Journal of System Assurance Engineering and Management, 2024, vol. 15, issue 8, No 37, 4085-4104
Abstract:
Imbalanced datasets, i.e., datasets with an uneven sample distribution among classes, cause training biases and degrade the performance of learning algorithms. Several solutions for handling data imbalance have been proposed in the past, but most of them focus on removing majority-class instances, which leads to the loss of important information. An alternative strategy investigated in the literature is to generate minority-class samples. However, generating high-quality synthetic samples for the minority class remains an open problem. In this study, a fusion of the grey wolf optimizer (GWO) with artificial bee colony (ABC) is proposed to generate good representative samples of the minority class. The combination is analysed because GWO has good exploitation abilities, while ABC is good at exploration. The effectiveness of the proposed method is tested, using standard assessment indicators, on 20 real-world benchmark datasets and on one real-life application, i.e., scam video classification on YouTube. The performance of the proposed method is compared against 18 state-of-the-art data imbalance handling methods using three classification algorithms: support vector machine (SVM), k-nearest neighbours (KNN) and decision tree (DT). The experimental results show an improvement in G-mean score on 18 out of 20 datasets with a maximum improvement of 8% for SVM, and on 17 out of 20 datasets for KNN and DT with maximum improvements of 10.7% and 6.3%, respectively. An improvement in AUC score is also seen on 17 out of 20 datasets for SVM and DT with maximum improvements of 4.5% and 6%, respectively, and on 16 out of 20 datasets for KNN with a maximum improvement of 7.7%. These results indicate that the proposed method is robust.
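The abstract reports results in terms of G-mean. As a minimal sketch, assuming the common binary-classification definition (the square root of sensitivity times specificity), the metric could be computed as follows. The function name `g_mean` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumption: `positive` marks the minority class; every other label
    is treated as the majority (negative) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (hypothetical): 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                        # 80% accurate, no minority recall
mixed_guess = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]      # finds 1 of 2 minority samples

print(g_mean(y_true, always_majority))
print(g_mean(y_true, mixed_guess))
```

A classifier that always predicts the majority class scores a G-mean of zero despite its high accuracy, which is why the metric is commonly used for imbalanced data.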
Keywords: Swarm intelligence; Grey wolf optimization; Artificial bee colony; Imbalanced datasets; Oversampling; Machine learning; Scam video classification
Date: 2024
Downloads: http://link.springer.com/10.1007/s13198-024-02412-w (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:ijsaem:v:15:y:2024:i:8:d:10.1007_s13198-024-02412-w
Ordering information: This journal article can be ordered from
http://www.springer.com/engineering/journal/13198
DOI: 10.1007/s13198-024-02412-w
International Journal of System Assurance Engineering and Management is currently edited by P.K. Kapur, A.K. Verma and U. Kumar
More articles in International Journal of System Assurance Engineering and Management from Springer, The Society for Reliability, Engineering Quality and Operations Management (SREQOM), India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.