On multivariate binary data clustering and feature weighting
Nizar Bouguila
Computational Statistics & Data Analysis, 2010, vol. 54, issue 1, 120-134
Abstract:
This paper presents an approach for partitioning data sets of unlabeled binary vectors without a priori information about the number of clusters or the saliency of the features. The unsupervised binary feature selection problem is addressed with finite mixture models of multivariate Bernoulli distributions. Using stochastic complexity, the proposed model simultaneously determines the number of clusters in a given data set of binary vectors and the saliency of the features used. We report applications involving real data, document classification, and image categorization to show the merits of the proposed approach.
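To make the modelling idea in the abstract concrete, the following is a minimal sketch of a finite mixture of multivariate Bernoulli distributions, assuming features are independent within each component. It is not the paper's algorithm: the stochastic-complexity criterion for choosing the number of clusters and the feature-saliency weights are not shown. The function names (`bernoulli_log_prob`, `responsibilities`, `mixture_log_likelihood`) and the example parameters are hypothetical and chosen only for illustration.

```python
import math


def bernoulli_log_prob(x, theta):
    """Log-probability of a binary vector x under one component.

    theta[d] is the assumed probability that feature d equals 1;
    every theta[d] must lie strictly between 0 and 1.
    """
    return sum(
        math.log(t) if xi == 1 else math.log(1.0 - t)
        for xi, t in zip(x, theta)
    )


def responsibilities(x, weights, thetas):
    """Posterior probability of each component given x (an EM-style E-step)."""
    log_joint = [
        math.log(w) + bernoulli_log_prob(x, th)
        for w, th in zip(weights, thetas)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint)
    unnorm = [math.exp(v - m) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def mixture_log_likelihood(data, weights, thetas):
    """Log-likelihood of a list of binary vectors under the mixture."""
    total = 0.0
    for x in data:
        log_joint = [
            math.log(w) + bernoulli_log_prob(x, th)
            for w, th in zip(weights, thetas)
        ]
        m = max(log_joint)
        total += m + math.log(sum(math.exp(v - m) for v in log_joint))
    return total


# Illustrative (made-up) parameters: two clusters over three binary features.
weights = [0.5, 0.5]
thetas = [[0.9, 0.9, 0.1], [0.1, 0.1, 0.9]]
print(responsibilities([1, 1, 0], weights, thetas))
```

In a full clustering procedure these responsibilities would be recomputed in alternation with updates of `weights` and `thetas`, and some model-selection criterion would compare fits across different numbers of components.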
Date: 2010
Full text (ScienceDirect subscribers only): http://www.sciencedirect.com/science/article/pii/S0167-9473(09)00261-8
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:54:y:2010:i:1:p:120-134
Computational Statistics & Data Analysis is currently edited by S.P. Azen