Information-Theoretic Clustering and Algorithms

Uchiyama, Toshio

A chapter in Advances in Statistical Methodologies and Their Application to Real Problems from IntechOpen

Abstract: Clustering is the task of partitioning objects into clusters according to given criteria so that objects in the same cluster are similar. Many clustering methods have been proposed over the past several decades. Because clustering results depend on both the criterion and the algorithm, selecting them appropriately is essential. Large sets of user behavior logs and text documents have recently become common, and they are often represented as high-dimensional, sparse vectors. This chapter introduces information-theoretic clustering (ITC), which is well suited to analyzing such high-dimensional data, from both theoretical and experimental perspectives. On the theoretical side, it presents the criterion, generative models, and novel algorithms. On the experimental side, it demonstrates the effectiveness and usefulness of ITC for text analysis as an important example.

Keywords: information-theoretic clustering; competitive learning; Kullback-Leibler divergence; Jensen-Shannon divergence; clustering algorithm; text analysis
JEL-codes: C60

Downloads:
https://www.intechopen.com/chapters/53254 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:ito:pchaps:109045

DOI: 10.5772/66588

Series: Chapters from IntechOpen
Bibliographic data for the series is maintained by Slobodan Momcilovic.