Spherical k-Means Clustering
Kurt Hornik, Ingo Feinerer, Martin Kober and Christian Buchta
Journal of Statistical Software, 2012, vol. 50, issue 10
Abstract:
Clustering text documents is a fundamental task in modern data analysis, requiring approaches that perform well in terms of both solution quality and computational efficiency. Spherical k-means clustering is one approach that addresses both issues, employing cosine dissimilarities to perform prototype-based partitioning of term weight representations of the documents. This paper presents the theory underlying the standard spherical k-means problem and suitable extensions, and introduces the R extension package skmeans, which provides a computational environment for spherical k-means clustering featuring several solvers: a fixed-point algorithm, a genetic algorithm, and interfaces to two external solvers (CLUTO and Gmeans). The performance of these solvers is investigated by means of a large-scale benchmark experiment.
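To make the idea of prototype-based partitioning under cosine dissimilarity concrete, here is a minimal pure-Python sketch of a fixed-point style iteration. It is an illustration under stated assumptions, not the implementation in the skmeans package: the function names (`normalize`, `cosine_dissimilarity`, `spherical_kmeans`), the random initialization, the stopping rule, and the toy term-weight rows are all hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_dissimilarity(u, v):
    """Return 1 - cos(u, v); assumes u and v already have unit length."""
    return 1.0 - sum(a * b for a, b in zip(u, v))


def spherical_kmeans(rows, k, max_iter=100, seed=0):
    """Hypothetical helper: alternate assignment and prototype updates.

    Returns (labels, prototypes), where prototypes are unit-length vectors.
    """
    rng = random.Random(seed)
    data = [normalize(r) for r in rows]
    # Assumption: start from k distinct documents chosen at random.
    prototypes = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its least dissimilar prototype.
        new_labels = [
            min(range(k), key=lambda j: cosine_dissimilarity(x, prototypes[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each prototype becomes the normalized sum of its members.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous prototype
                prototypes[j] = normalize([sum(col) for col in zip(*members)])
    return labels, prototypes


# Toy term-weight rows (made up for illustration): two documents per topic.
docs = [[3, 1, 0, 0], [2, 1, 0, 0], [0, 0, 1, 4], [0, 0, 2, 5]]
labels, prototypes = spherical_kmeans(docs, k=2)
print(labels)
```

On this toy input the first two rows end up in one cluster and the last two in the other. Normalizing every document and every prototype to unit length is what makes the comparison depend only on direction, so documents of different lengths with similar term proportions are grouped together.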
Date: 2012-09-18
Downloads: (external link)
https://www.jstatsoft.org/index.php/jss/article/view/v050i10/v50i10.pdf
https://www.jstatsoft.org/index.php/jss/article/do ... skmeans_0.2-3.tar.gz
https://www.jstatsoft.org/index.php/jss/article/do ... 0i10-replication.zip
https://www.jstatsoft.org/index.php/jss/article/do ... ks_2011.10.24.tar.gz
Persistent link: https://EconPapers.repec.org/RePEc:jss:jstsof:v:050:i10
DOI: 10.18637/jss.v050.i10
Journal of Statistical Software is currently edited by Bettina Grün, Edzer Pebesma and Achim Zeileis
More articles in Journal of Statistical Software from Foundation for Open Access Statistics
Bibliographic data for series maintained by Christopher F. Baum (baum@bc.edu).