
Analyzing the Role of Class Rebalancing Techniques in Software Defect Prediction

Yousef Alqasrawi, Mohammad Azzeh and Yousef Elsheikh
Additional contact information
Yousef Alqasrawi: Faculty of Information Technology, Applied Science Private University, Amman, Jordan
Mohammad Azzeh: Department of Data Science, Princess Sumaya University for Technology, Amman, Jordan
Yousef Elsheikh: Faculty of Information Technology, Applied Science Private University, Amman, Jordan

International Journal of Information Technology & Decision Making (IJITDM), 2024, vol. 23, issue 06, 2167-2207

Abstract: Predicting software defects is an important task during the software testing phase, especially for allocating appropriate resources and prioritizing testing tasks. Typically, classification algorithms accomplish this task using previously collected datasets. However, these datasets suffer from an imbalanced label distribution in which clean modules outnumber defective modules. Traditional classification algorithms cannot handle this imbalance because they assume that the datasets are balanced. If the problem is not addressed, the classification algorithm produces predictions biased towards the majority label. The literature offers several techniques designed to address this problem, most of which focus on data re-balancing. Recently, ensemble class imbalance techniques have emerged as an alternative to data re-balancing approaches. In software defect prediction, no studies have examined the performance of ensemble class imbalance learning against data re-balancing approaches. This paper investigates the efficiency of ensemble class imbalance learning for software defect prediction. We conducted a comprehensive experiment involving 12 datasets, six classifiers, nine class imbalance techniques, and 10 evaluation metrics. The experiments showed that ensemble approaches, particularly the Under Bagging technique, outperform traditional data re-balancing approaches, particularly when dealing with datasets that have high defect ratios.
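
For readers unfamiliar with the Under Bagging idea named in the abstract, the following is a minimal sketch of how such an ensemble is commonly built: each ensemble member is trained on a balanced bag that pairs a bootstrap sample of the defective (minority) modules with an equally sized random draw from the clean (majority) modules, and predictions are combined by majority vote. This is not the authors' implementation. The function names, the nearest-centroid base learner, the toy data, and the convention that label 1 marks the defective minority are all assumptions made for illustration; the abstract does not name the six classifiers used in the study.

```python
import random
from statistics import mean


def fit_centroid(X, y):
    """Toy base learner (an assumption): store the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def under_bagging_fit(X, y, n_estimators=10, seed=0):
    """Train n_estimators base learners, each on its own balanced bag.

    Assumes label 1 is the minority (defective) class and label 0 the majority.
    """
    rng = random.Random(seed)
    minority = [i for i, t in enumerate(y) if t == 1]
    majority = [i for i, t in enumerate(y) if t == 0]
    models = []
    for _ in range(n_estimators):
        # Bootstrap the minority class, then draw the same number of
        # majority rows so that every bag is balanced.
        bag = [rng.choice(minority) for _ in minority]
        bag += [rng.choice(majority) for _ in minority]
        models.append(fit_centroid([X[i] for i in bag], [y[i] for i in bag]))
    return models


def under_bagging_predict(models, x):
    """Majority vote over the ensemble; ties are resolved as clean (0)."""
    votes = sum(predict_centroid(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


# Hypothetical toy data: 18 clean modules and 2 defective ones, 2 features each.
X = [[1.0 + 0.1 * i, 1.0 + 0.05 * i] for i in range(18)] + [[5.0, 5.0], [5.5, 4.5]]
y = [0] * 18 + [1] * 2
models = under_bagging_fit(X, y, n_estimators=5, seed=1)
```

Because every bag contains as many defective as clean rows, no single base learner sees the skewed distribution of the full dataset, while the ensemble as a whole still draws on many different majority rows.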

Keywords: Software defect prediction; class imbalance; machine learning
Date: 2024

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219622023500724
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijitdm:v:23:y:2024:i:06:n:s0219622023500724

DOI: 10.1142/S0219622023500724

International Journal of Information Technology & Decision Making (IJITDM) is currently edited by Yong Shi

More articles in International Journal of Information Technology & Decision Making (IJITDM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.

 
Page updated 2025-03-20
Handle: RePEc:wsi:ijitdm:v:23:y:2024:i:06:n:s0219622023500724