Discovering optimal clusters using firefly algorithm
Athraa Jasim Mohammed, Yuhanis Yusof and Husniza Husni
International Journal of Data Mining, Modelling and Management, 2016, vol. 8, issue 4, 330-347
Abstract:
Conventional clustering techniques require a pre-determined number of clusters; unfortunately, missing information about real-world problems makes choosing that number difficult. A newer direction in data clustering is to cluster a given set of items automatically by identifying the appropriate number of clusters and the optimal centre for each cluster. In this paper, we present the WFA_selection algorithm, which originates from the weight-based firefly algorithm (WFA). WFA_selection merges selected clusters in order to produce better-quality clusters. Experiments utilising the WFA and WFA_selection algorithms were conducted on the 20Newsgroups and Reuters-21578 benchmark datasets, and the outputs were compared against bisect K-means and the general stochastic clustering method (GSCM). Results demonstrate that WFA_selection generates more robust and compact clusters than WFA, bisect K-means and GSCM.
Keywords: partitional clustering; dynamic clustering; hierarchical clustering; text clustering; firefly algorithm; cluster discovering; optimal clusters; data clustering; bisect K-means clustering; general stochastic clustering.
Date: 2016
Downloads: (external link)
http://www.inderscience.com/link.php?id=81239 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:8:y:2016:i:4:p:330-347
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd