Text Clustering using Distances Combination by Social Bees: Towards 3D Visualisation Aspect

Hadj Ahmed Bouarara, Reda Mohamed Hamou and Abdelmalek Amine
Additional contact information
Hadj Ahmed Bouarara: GeCode Laboratory, Tahar Moulay University of Saida, Saida, Algeria
Reda Mohamed Hamou: Department of Computer Science, Tahar Moulay University of Saida, Saida, Algeria
Abdelmalek Amine: Tahar Moulay University of Saida, Saida, Algeria

International Journal of Information Retrieval Research (IJIRR), 2014, vol. 4, issue 3, 34-53

Abstract: Recent research indicates that 90% of the information on the web is presented in an unstructured format (free text). Automatic text classification (clustering) has therefore become a crucial challenge in the computer science community, where most classical techniques suffer from problems of execution time, the multiplicity of data domains (marketing, biology, economics), and the initialization of the number of clusters. The bio-inspired paradigm has recently achieved genuine success in several sectors, particularly in data mining. This work presents a novel approach to text clustering called distances combination by social bees (DC-SB), composed of four steps: (1) pre-processing, which uses different text representation methods (bag of words and character n-grams) and TF-IDF weighting to construct the document vectors; (2) bees' artificial life, in which the authors imitate the functioning of social bees using three artificial worker bees (cleaner, guardian and forager), each characterized by its own distance measure, different from the others and computed relative to the artificial queen (centroid) of the cluster (hive); (3) clustering, based on the concept of filtering, where each filter is controlled by an artificial worker and a document must pass three different obstacles to be added to the cluster; and (4) visualization, which provides 3D navigation of the results through a global and detailed view of the hive and the apiary, with zooming and rotation. For the experiments, the authors use the Reuters-21578 benchmark and a variety of validation measures (execution time, F-measure and entropy) while varying the parameters (threshold, combination of distance measures and text representation). They compare their results with the performance of other methods from the literature (2D Cellular Automata, Artificial Immune System (AIS) and Artificial Social Spiders (ASS)) and conclude that the approach can solve the text clustering problem.
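The filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which distance measures the workers use, how thresholds are set, or how queens are chosen. The pairing of cosine, Euclidean and Manhattan distances with the cleaner, guardian and forager, the per-worker thresholds, and the greedy choice of the first unassigned document as queen are all assumptions made here for illustration, as are the function names.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build bag-of-words TF-IDF vectors over a shared, sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = max(len(tokens), 1)  # guard against empty documents
        vectors.append([
            (counts[term] / length) * math.log(n_docs / doc_freq[term])
            for term in vocab
        ])
    return vectors


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat a zero vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


# ASSUMPTION: which distance each worker uses is not given in the abstract.
WORKERS = (
    ("cleaner", cosine_distance),
    ("guardian", euclidean_distance),
    ("forager", manhattan_distance),
)


def passes_all_filters(doc_vec, queen, thresholds):
    """A document joins the hive only if every worker's filter accepts it."""
    return all(dist(doc_vec, queen) <= thresholds[name] for name, dist in WORKERS)


def cluster(vectors, thresholds):
    """Greedy hive building (hypothetical): the first unassigned document
    becomes the queen, and every other unassigned document that passes all
    three filters joins its hive. Returns hives as lists of document indices."""
    unassigned = list(range(len(vectors)))
    hives = []
    while unassigned:
        queen_idx = unassigned[0]
        queen = vectors[queen_idx]
        hive = [queen_idx] + [
            i for i in unassigned[1:]
            if passes_all_filters(vectors[i], queen, thresholds)
        ]
        hives.append(hive)
        unassigned = [i for i in unassigned if i not in hive]
    return hives


if __name__ == "__main__":
    docs = [
        "oil prices rise",
        "oil prices fall",
        "wheat harvest grows",
        "wheat harvest shrinks",
    ]
    thresholds = {"cleaner": 0.7, "guardian": 0.7, "forager": 1.0}
    print(cluster(tfidf_vectors(docs), thresholds))
```

In this sketch the number of hives is not fixed in advance; it emerges from the thresholds. Whether DC-SB determines the cluster count this way is not stated in the abstract.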

Date: 2014

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJIRR.2014070103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jirr00:v:4:y:2014:i:3:p:34-53

International Journal of Information Retrieval Research (IJIRR) is currently edited by Zhongyu Lu

Page updated 2025-03-19
Handle: RePEc:igg:jirr00:v:4:y:2014:i:3:p:34-53