Binary clustering with missing data
M. Nadif and G. Govaert
Applied Stochastic Models and Data Analysis, 1993, vol. 9, issue 1, 59-71
Abstract:
A clustering method is presented for analysing multivariate binary data with missing values. For the case where all values are observed, Govaert [3] has studied the relations between clustering methods and statistical models. He showed that identifying a mixture of Bernoulli distributions with the same parameter for all clusters and for all variables corresponds to a clustering criterion based on the L1 distance, which characterizes the MNDBIN method (Marchetti [8]). He generalized this model first by letting the parameters depend on the variables, and then by letting them depend on both the variables and the clusters. We use these models to derive a clustering method adapted to missing data. The method optimizes a criterion with a standard iterative partitioning algorithm, which removes the need either to discard objects with missing values or to impute the missing data. We study several versions of this algorithm and give a brief account of the application of the method to some simulated data.
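As an illustration only, the following is a minimal sketch of an iterative partitioning scheme of the kind the abstract describes, under stated assumptions. It is not the authors' algorithm. It assumes the simplest of the three models (one common parameter, hence an L1 criterion), binary cluster centers, and a distance summed over observed entries only, so that no object is discarded and no value is imputed. The function names `observed_l1`, `update_center` and `cluster_binary_missing`, the use of `None` for a missing value, and the tie-breaking rule are all hypothetical choices made for this example.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[int]]  # each entry is 0, 1, or None (missing)


def observed_l1(row: Row, center: Sequence[int]) -> int:
    """L1 distance to a binary center, summed over observed entries only."""
    return sum(abs(x - c) for x, c in zip(row, center) if x is not None)


def update_center(rows: List[Row], old_center: Sequence[int]) -> List[int]:
    """Per-variable majority vote among observed values.

    Assumption: on a tie, or when no value is observed for a variable,
    the previous center value is kept.
    """
    new_center = []
    for j, old in enumerate(old_center):
        observed = [r[j] for r in rows if r[j] is not None]
        ones = sum(observed)
        zeros = len(observed) - ones
        if ones > zeros:
            new_center.append(1)
        elif zeros > ones:
            new_center.append(0)
        else:
            new_center.append(old)
    return new_center


def cluster_binary_missing(
    data: List[Row], init_centers: List[Sequence[int]], max_iter: int = 100
) -> Tuple[List[int], List[List[int]]]:
    """Alternate assignment and center update until the partition is stable."""
    centers = [list(c) for c in init_centers]
    labels = [-1] * len(data)
    for _ in range(max_iter):
        # Assignment step: nearest center under the observed-entry L1 distance.
        new_labels = [
            min(range(len(centers)), key=lambda k: observed_l1(row, centers[k]))
            for row in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each center from its current members.
        for k in range(len(centers)):
            members = [row for row, lab in zip(data, labels) if lab == k]
            centers[k] = update_center(members, centers[k])
    return labels, centers


data = [
    [1, 1, None, 0],
    [1, None, 1, 0],
    [1, 1, 1, None],
    [0, 0, None, 1],
    [None, 0, 0, 1],
    [0, 0, 0, 1],
]
labels, centers = cluster_binary_missing(data, [[1, 1, 1, 0], [0, 0, 0, 1]])
print(labels, centers)
```

In this sketch the initial centers are supplied by the caller. The variants in which parameters depend on variables, or on variables and clusters, would presumably replace the plain L1 distance with a weighted one, which is not attempted here.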
Date: 1993
https://doi.org/10.1002/asm.3150090105
Persistent link: https://EconPapers.repec.org/RePEc:wly:apsmda:v:9:y:1993:i:1:p:59-71
More articles in Applied Stochastic Models and Data Analysis from John Wiley & Sons