Software Defect Prediction with Bayesian Approaches

Hernández-Molinos, María José; Sánchez-García, Angel J.; Barrientos-Martínez, Rocío Erandi; Pérez-Arriaga, Juan Carlos; Ocharán-Hernández, Jorge Octavio

Additional contact information
María José Hernández-Molinos: Facultad de Estadística e Informática, Universidad Veracruzana, Xalapa 91020, Veracruz, Mexico
Angel J. Sánchez-García: Facultad de Estadística e Informática, Universidad Veracruzana, Xalapa 91020, Veracruz, Mexico
Rocío Erandi Barrientos-Martínez: Instituto de Investigaciones en Inteligencia Artificial, Universidad Veracruzana, Xalapa 91097, Veracruz, Mexico
Juan Carlos Pérez-Arriaga: Facultad de Estadística e Informática, Universidad Veracruzana, Xalapa 91020, Veracruz, Mexico
Jorge Octavio Ocharán-Hernández: Facultad de Estadística e Informática, Universidad Veracruzana, Xalapa 91020, Veracruz, Mexico

Mathematics, 2023, vol. 11, issue 11, 1-18

Abstract: Software defect prediction is an important area in software engineering because it helps developers identify and fix problems before they become costly, hard-to-fix bugs. Early detection of software defects saves time and money in the software development process and helps ensure the quality of the final product. This research evaluates three algorithms for building Bayesian Networks that classify whether a project is prone to defects. The choice is motivated by the fact that Naive Bayes is the most used approach in the literature, whereas no works use Bayesian Networks. Thus, K2, Hill Climbing, and TAN are used to construct Bayesian Networks. Three public PROMISE data sets, based on McCabe and Halstead complexity metrics, are used. The results are compared with the most used approaches in the literature, such as Decision Tree and Random Forest. Results from different performance metrics, obtained through a cross-validation process, show that the classification performance is comparable to that of Decision Tree and Random Forest, with the advantage that the Bayesian algorithms show less variability. This gives software engineers greater robustness in their predictions, since the selection of training and test data does not produce variable results, unlike with Decision Tree and Random Forest.
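
The evaluation idea in the abstract (score a classifier on each cross-validation fold, then compare both the average and the spread of the scores) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the three binary "complexity flag" features are hypothetical stand-ins for McCabe and Halstead metrics, and a simple categorical Naive Bayes replaces the K2, Hill Climbing, and TAN Bayesian Networks used in the article. It is not the authors' implementation.

```python
import math
import random
from statistics import mean, stdev


def train_nb(rows, labels):
    """Fit a categorical Naive Bayes model: class counts and per-class feature-value counts."""
    class_counts = {}
    value_counts = {}  # (class, feature index, value) -> count
    for row, y in zip(rows, labels):
        class_counts[y] = class_counts.get(y, 0) + 1
        for j, v in enumerate(row):
            key = (y, j, v)
            value_counts[key] = value_counts.get(key, 0) + 1
    return class_counts, value_counts


def predict_nb(model, row, n_values=2):
    """Return the class with the highest log-posterior, using Laplace smoothing."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for y, n_y in class_counts.items():
        score = math.log(n_y / total)
        for j, v in enumerate(row):
            score += math.log((value_counts.get((y, j, v), 0) + 1) / (n_y + n_values))
        if score > best_score:
            best_class, best_score = y, score
    return best_class


def cross_val_accuracies(rows, labels, k=10, seed=0):
    """Shuffle once, split into k folds, and return the accuracy on each held-out fold."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_nb([rows[i] for i in train], [labels[i] for i in train])
        hits = sum(predict_nb(model, rows[i]) == labels[i] for i in fold)
        scores.append(hits / len(fold))
    return scores


def make_toy_data(n=300, seed=1):
    """Hypothetical data: three binary 'high complexity' flags and a noisy defect label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = tuple(rng.randint(0, 1) for _ in range(3))
        p_defect = 0.15 + 0.25 * sum(row)  # more flags -> more likely to be defective
        labels.append(1 if rng.random() < p_defect else 0)
        rows.append(row)
    return rows, labels


rows, labels = make_toy_data()
scores = cross_val_accuracies(rows, labels, k=10)
print(f"mean accuracy = {mean(scores):.3f}, std across folds = {stdev(scores):.3f}")
```

Running the same harness with several classifiers and comparing the standard deviations is one way to read the "less variability" claim: a lower spread across folds means the result depends less on which records end up in the training and test sets.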

Keywords: software defect prediction; Bayesian Networks; classification; machine learning
JEL-codes: C
Date: 2023

Downloads:
https://www.mdpi.com/2227-7390/11/11/2524/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/11/2524/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:11:p:2524-:d:1160502

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.