Software Defect Prediction Analysis Using Machine Learning Techniques

Khalid, Aimen; Badshah, Gran; Ayub, Nasir; Shiraz, Muhammad; Ghouse, Mohamed

Additional contact information
Aimen Khalid: Department of Computer Science, Federal Urdu University of Arts, Science and Technology Islamabad, Islamabad 44000, Pakistan
Gran Badshah: Department of Computer Science, College of Computer Science, King Khalid University Abha, Abha 61413, Saudi Arabia
Nasir Ayub: Department of Software Engineering, Faculty of Computing, Capital University of Science and Technology, Islamabad 44000, Pakistan
Muhammad Shiraz: Department of Computer Science, Federal Urdu University of Arts, Science and Technology Islamabad, Islamabad 44000, Pakistan
Mohamed Ghouse: Department of Computer Science, College of Computer Science, King Khalid University Abha, Abha 61413, Saudi Arabia

Sustainability, 2023, vol. 15, issue 6, 1-17

Abstract: Defect-free software is always desirable, both to maintain the quality that customers expect and to reduce testing costs. To that end, we examined several well-known machine learning (ML) techniques, together with optimized variants of them, on a freely available dataset. The aim of the research was to improve model accuracy and precision on this dataset relative to previous research, since earlier investigations show that accuracy can be improved further. We employed K-means clustering to categorize the class labels, then applied classification models to selected features, and used Particle Swarm Optimization to optimize the ML models. We evaluated the models using precision, accuracy, recall, F-measure, performance error metrics, and a confusion matrix. The results indicate that all of the ML and optimized ML models perform well; the SVM and optimized SVM models performed best, with accuracies of 99% and 99.80%, respectively. The accuracies of the NB, optimized NB, RF, optimized RF and ensemble approaches are 93.90%, 93.80%, 98.70%, 99.50%, 98.80% and 97.60%, respectively. We thereby achieve higher accuracy than previous studies, which was our goal.
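
As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall and F-measure can be derived from a binary confusion matrix. It is not the authors' code: the function names and the toy labels are hypothetical, label 1 is assumed to mean "defective", and the paper's performance error metrics are not covered.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as the positive (defective) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F-measure from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical labels for eight modules (1 = defective, 0 = clean).
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(confusion_counts(y_true, y_pred))
    print(evaluate(y_true, y_pred))
```

On a defect dataset where defective modules are rare, accuracy alone can look high even when few defects are caught, which is one reason to report precision and recall alongside it.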

Keywords: software defect prediction; machine learning; k-means clustering; support vector machine; naïve Bayes; random forest; ensemble approach; particle swarm optimization
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2023

Downloads:
https://www.mdpi.com/2071-1050/15/6/5517/pdf (application/pdf)
https://www.mdpi.com/2071-1050/15/6/5517/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:15:y:2023:i:6:p:5517-:d:1103190

Sustainability is currently edited by Ms. Alexandra Wu
