EconPapers    
Economics at your fingertips  
 

Clustering of text documents with keyword weighting function

A. Christy, G. Meera Gandhi and S. Vaithyasubramanian

International Journal of Intelligent Enterprise, 2019, vol. 6, issue 1, 19-31

Abstract: In this digital world, data is available in abundance everywhere and it is growing at a phenomenal rate. Making data available readily for decision making is an important task of data analyst. In this article, we propose an unsupervised learning algorithm for text document clustering by adopting keyword weighting function. Documents are pre-processed and relevant keywords based on their weights are grouped together. Clustered keyword weighting (CKW) takes each class in the training collection as a known cluster, and searches for feature weights iteratively to optimise the clustering objective function, in order to retrieve the best clustering result. Performance of CKW is validated by clustering BBC news collection text collections. Experiments were conducted with simple K-means, hierarchical clustering algorithms and our keyword weighting and clustering approach has shown improved cluster quality compared to the other methods.

Keywords: documents; cluster; unsupervised; feature; K-means; normalised. (search for similar items in EconPapers)
Date: 2019
References: Add references at CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
http://www.inderscience.com/link.php?id=100029 (text/html)
Access to full text is restricted to subscribers.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:ids:ijient:v:6:y:2019:i:1:p:19-31

Access Statistics for this article

More articles in International Journal of Intelligent Enterprise from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker ().

 
Page updated 2025-03-19
Handle: RePEc:ids:ijient:v:6:y:2019:i:1:p:19-31