A novel approach for imputation of missing continuous attribute values in databases using genetic algorithm
R. Devi Priya and S. Kuppuswami
International Journal of Information Technology and Management, 2015, vol. 14, issue 2/3, 185-200
Abstract:
Missing values are common in databases and, if left untreated, distort estimates. Researchers have developed numerous methods to replace missing values in continuous attributes. The simple methods are less efficient, while the efficient methods are very complex to implement. Hence, to balance simplicity and efficiency, a new method called Bayesian genetic algorithm (BGA), based on a genetic algorithm and Bayes' theorem, is proposed for both the missing at random (MAR) and missing completely at random (MCAR) assumptions. The accuracy of BGA in estimating the missing values is compared with that of mean, kNN and multiple imputation, and the results are studied. BGA produces more accurate results than the other methods on the four datasets studied, at rates of missingness ranging from 5% to 60%. BGA also performs better on large datasets, resulting in less biased estimates.
Keywords: continuous attributes; missing values; Bayesian genetic algorithms; BGA; missing at random; MAR; missing completely at random; MCAR; databases.
Date: 2015
Downloads: (external link)
http://www.inderscience.com/link.php?id=68461 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijitma:v:14:y:2015:i:2/3:p:185-200
More articles in International Journal of Information Technology and Management from Inderscience Enterprises Ltd