Software Defect Prediction Through a Hybrid Approach Comprising of a Statistical Tool and a Machine Learning Model
Ashis Kumar Chakraborty and Barin Karmakar
Additional contact information
Ashis Kumar Chakraborty: ISI Kolkata
Barin Karmakar: ISI Kolkata
Chapter 1 in Applications of Operational Research in Business and Industries, 2023, pp 1-19 from Springer
Abstract:
Traditional statistical learning algorithms perform poorly when learning from imbalanced datasets. Software defect prediction (SDP) is a useful way to identify defects in the early phases of the software development life cycle. It helps remove software defects and supports building cost-effective, good-quality software products. Several statistical and machine learning models have been employed to predict defects in software modules. However, the imbalanced nature of such datasets is a key characteristic that must be taken into account to develop a successful defect prediction model. Imbalanced software datasets have non-uniform class distributions, with most instances belonging to one class. We propose a novel hybrid model based on the Hellinger distance-based decision tree (HDDT) and an artificial neural network (ANN), which we call the hybrid HDDT-ANN model, for the analysis of SDP data. This newly developed model is found to be quite effective in predicting software bugs. A comparative study of several supervised machine learning models against the proposed model, using different performance measures, is also presented. Hybrid HDDT-ANN also draws on the strength of a skew-insensitive distance measure, the Hellinger distance, in handling class imbalance problems. Detailed experiments were performed on ten NASA SDP datasets to demonstrate the superiority of the proposed method.
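The abstract's claim that the Hellinger distance is skew-insensitive can be illustrated with a minimal sketch. It assumes the split criterion in the form usually given in the HDDT literature for a two-class problem, which compares the per-class fractions of instances sent to each branch. The function name `hellinger_split_value` and the example counts are hypothetical and are not taken from the chapter. This is not the authors' HDDT-ANN implementation.

```python
import math


def hellinger_split_value(pos_counts, neg_counts):
    """Hellinger distance between the class-conditional branch distributions.

    pos_counts[j] / neg_counts[j] are the numbers of defective / non-defective
    modules that a candidate split sends to branch j (hypothetical inputs).
    """
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes must be present to score a split")
    # Only within-class fractions are used, so the class prior never enters.
    return math.sqrt(sum(
        (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
        for p, n in zip(pos_counts, neg_counts)
    ))


# Same split, evaluated with 10x more majority-class instances.
mild_skew = hellinger_split_value([8, 2], [10, 90])
heavy_skew = hellinger_split_value([8, 2], [100, 900])

# A split that separates the classes perfectly, and one that does nothing.
perfect = hellinger_split_value([10, 0], [0, 90])
useless = hellinger_split_value([5, 5], [50, 50])
```

Because only the within-class fractions enter the formula, `mild_skew` and `heavy_skew` are equal: growing the majority class tenfold does not change the score. The value ranges from 0 for a split that does not separate the classes to the square root of 2 for a perfect one.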
Keywords: Software defect prediction; Class imbalance; Hellinger distance; Artificial neural network; Hybrid model
Date: 2023
Persistent link: https://EconPapers.repec.org/RePEc:spr:lnopch:978-981-19-8012-1_1
Ordering information: This item can be ordered from
http://www.springer.com/9789811980121
DOI: 10.1007/978-981-19-8012-1_1
More chapters in Lecture Notes in Operations Research from Springer