Evaluation of clustering algorithms for word sense disambiguation

Broda, Bartosz; Mazur, Wojciech

International Journal of Data Analysis Techniques and Strategies, 2012, vol. 4, issue 3, 219-236

Abstract: Word sense disambiguation in text is still a difficult problem, as the best supervised methods require laborious and costly preparation of training data. This work evaluates selected clustering algorithms on the task of word sense disambiguation. We used five datasets for two languages (English and Polish). Five clustering algorithms (k-means, k-medoids, hierarchical agglomerative clustering, hierarchical divisive clustering, graph-partitioning-based clustering) and two weighting schemes were tested. The best parameters of the algorithms were chosen using 5 × 2 cross-validation. The BCubed measure was employed to evaluate the clusterings. We conclude that, with these settings, agglomerative hierarchical clustering achieves the best results on all the datasets.
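
For readers unfamiliar with BCubed, the following is a minimal sketch of the item-level BCubed precision, recall and F1 as they are commonly defined for clustering evaluation. It is not taken from the paper: the function name `bcubed`, the label encoding, and the use of the harmonic mean as the combined score are assumptions made for illustration, and the abstract does not say which variant of the measure the authors used.

```python
from collections import Counter


def bcubed(clusters, senses):
    """Return (precision, recall, f1) for item-level BCubed scores.

    clusters[i] is the cluster assigned to occurrence i (hypothetical input),
    senses[i] is its gold-standard sense label.
    """
    if len(clusters) != len(senses) or not clusters:
        raise ValueError("clusters and senses must be non-empty and equally long")

    n = len(clusters)
    cluster_sizes = Counter(clusters)           # items per cluster
    sense_sizes = Counter(senses)               # items per gold sense
    pair_sizes = Counter(zip(clusters, senses))  # items sharing cluster AND sense

    # Precision of one item: share of its cluster that has the same sense.
    precision = sum(
        pair_sizes[(c, s)] / cluster_sizes[c] for c, s in zip(clusters, senses)
    ) / n
    # Recall of one item: share of its sense that landed in the same cluster.
    recall = sum(
        pair_sizes[(c, s)] / sense_sizes[s] for c, s in zip(clusters, senses)
    ) / n
    # Harmonic mean; an assumption, other weightings are possible.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Five occurrences of an ambiguous word, two clusters, two gold senses.
    print(bcubed([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"]))
```

Each item counts itself, so both averages are always positive and a clustering that matches the gold senses exactly scores 1.0 on all three values.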

Keywords: clustering algorithms; word sense disambiguation; WSD; BCubed; senseval; bag of words; English; Polish
Date: 2012

Downloads: (external link)
http://www.inderscience.com/link.php?id=47817 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:4:y:2012:i:3:p:219-236

More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd