EconPapers    
Economics at your fingertips  
 

Software Defect Prediction Through a Hybrid Approach Comprising of a Statistical Tool and a Machine Learning Model

Ashis Kumar Chakraborty and Barin Karmakar ()
Additional contact information
Ashis Kumar Chakraborty: ISI Kolkata
Barin Karmakar: ISI Kolkata

Chapter Chapter 1 in Applications of Operational Research in Business and Industries, 2023, pp 1-19 from Springer

Abstract: Abstract Traditional statistical learning algorithms perform poorly in case of learning from an imbalanced dataset. Software defect prediction (SDP) is a useful way to identify defects in the primary phases of the software development life cycle. This SDP methodology will help to remove software defects and induce to build a cost-effective and good quality of software products. Several statistical and machine learning models have been employed to predict defects in software modules. But the imbalanced nature of this type of datasets is one of the key characteristics, which needs to be exploited, for the successful development of a defect prediction model. Imbalanced software datasets contain non-uniform class distributions with most of the instances belonging to a specific class compared to that of the other class. We propose a novel hybrid model based on Hellinger distance-based decision tree (HDDT) and artificial neural network (ANN), which we call as hybrid HDDT-ANN model, for analysis of software defect prediction (SDP) data. This is a newly developed model which is found to be quite effective in predicting software bugs. A comparative study of several supervised machine learning models with our proposed model using different performance measures is also produced. Hybrid HDDT-ANN also takes care of the strength of a skew-insensitive distance measure, known as Hellinger distance, in handling class imbalance problems. A detailed experiment was performed over ten NASA SDP datasets to prove the superiority of the proposed method.

Keywords: Software defect prediction; Class imbalance; Hellinger distance; Artificial neural network; Hybrid model (search for similar items in EconPapers)
Date: 2023
References: Add references at CitEc
Citations:

There are no downloads for this item, see the EconPapers FAQ for hints about obtaining it.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:lnopch:978-981-19-8012-1_1

Ordering information: This item can be ordered from
http://www.springer.com/9789811980121

DOI: 10.1007/978-981-19-8012-1_1

Access Statistics for this chapter

More chapters in Lecture Notes in Operations Research from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-04-01
Handle: RePEc:spr:lnopch:978-981-19-8012-1_1