Dynamic Clustering Based on Minimum Spanning Tree and Context Similarity for Enhancing Document Classification
Anirban Chakrabarty and Sudipta Roy
Additional contact information
Anirban Chakrabarty: Department of Computer Applications, Future Institute of Engineering and Management, Sonarpur, Kolkata, India
Sudipta Roy: Triguna Sen School of Information Technology, Assam Central University, Silchar, Assam, India
International Journal of Information Retrieval Research (IJIRR), 2014, vol. 4, issue 1, 46-60
Abstract:
Document classification is the task of assigning a text document to one or more predefined categories according to its content and a set of labeled training samples. Traditional classification schemes use all training samples for classification, which increases storage requirements and computational complexity as the number of features increases. Moreover, commonly used classification techniques assume that the number of categories is known in advance, which may not hold in practice. In practical scenarios, it is essential to determine the number of clusters for an unknown dataset dynamically. To address these limitations, the proposed work develops a text clustering algorithm in which clusters are generated dynamically from a minimum spanning tree that incorporates semantic features. The proposed model can efficiently find the significant matching concepts between documents and can perform multi-category classification. The formal analysis is supported by applications to email and cancer data sets. Cluster quality and accuracy were compared with those of several widely used text clustering techniques, and the comparison showed the efficiency of the proposed approach.
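Illustrative sketch (not taken from the paper): the abstract does not specify the context similarity measure or the rule used to split the minimum spanning tree, so the Python sketch below makes two loudly stated assumptions. It uses plain cosine distance on term counts as a stand-in for the semantic similarity, and it cuts every tree edge heavier than the mean edge weight plus k standard deviations. The function names and the parameter k are hypothetical. The sketch shows only the general idea that the number of clusters can fall out of the tree rather than being fixed in advance; it does not cover multi-category assignment.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """1 - cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - (dot / norm if norm else 0.0)


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix; returns (weight, u, v) edges."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


def mst_clusters(docs, k=1.0):
    """Return one cluster label per document.

    Assumption: an MST edge is cut when its weight exceeds
    mean + k * std of all MST edge weights (hypothetical rule).
    """
    n = len(docs)
    if n < 2:
        return [0] * n
    # Assumption: whitespace tokenisation and raw term counts.
    vecs = [Counter(d.lower().split()) for d in docs]
    dist = [[cosine_distance(vecs[i], vecs[j]) for j in range(n)]
            for i in range(n)]
    edges = minimum_spanning_tree(dist)
    weights = [w for w, _, _ in edges]
    mean = sum(weights) / len(weights)
    std = math.sqrt(sum((w - mean) ** 2 for w in weights) / len(weights))
    threshold = mean + k * std

    # Union-find over the edges that survive the cut.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for w, u, v in edges:
        if w <= threshold:
            root[find(u)] = find(v)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]
```

Calling mst_clusters on a list of raw text strings returns a list of integer labels, and the number of distinct labels is the number of clusters found. With the assumed cut rule, a collection whose tree edges all have the same weight stays in a single cluster, so k would need tuning on real data.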
Date: 2014
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/ijirr.2014010103 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jirr00:v:4:y:2014:i:1:p:46-60
International Journal of Information Retrieval Research (IJIRR) is currently edited by Zhongyu Lu
More articles in International Journal of Information Retrieval Research (IJIRR) from IGI Global