EconPapers    
Economics at your fingertips  
 

Performance Evaluation of Missing-Value Imputation Clustering Based on a Multivariate Gaussian Mixture Model

Jing Xiao, Qiongqiong Xu, Chuanli Wu, Yuexia Gao, Tianqi Hua and Chenwu Xu

PLOS ONE, 2016, vol. 11, issue 8, 1-14

Abstract: Background: It is challenging to deal with mixture models when missing values occur in clustering datasets. Methods and Results: We propose a dynamic clustering algorithm based on a multivariate Gaussian mixture model that efficiently imputes missing values to generate a “pseudo-complete” dataset. Parameters from different clusters and missing values are estimated according to the maximum likelihood implemented with an expectation-maximization algorithm, and multivariate individuals are clustered with Bayesian posterior probability. A simulation showed that our proposed method has a fast convergence speed and it accurately estimates missing values. Our proposed algorithm was further validated with Fisher’s Iris dataset, the Yeast Cell-cycle Gene-expression dataset, and the CIFAR-10 images dataset. The results indicate that our algorithm offers highly accurate clustering, comparable to that using a complete dataset without missing values. Furthermore, our algorithm resulted in a lower misjudgment rate than both clustering algorithms with missing data deleted and with missing-value imputation by mean replacement. Conclusion: We demonstrate that our missing-value imputation clustering algorithm is feasible and superior to both of these other clustering algorithms in certain situations.

Date: 2016
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0161112 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 61112&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0161112

DOI: 10.1371/journal.pone.0161112

Access Statistics for this article

More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone (plosone@plos.org).

 
Page updated 2025-03-19
Handle: RePEc:plo:pone00:0161112