Using Latent Semantic Indexing to Improve the Accuracy of Document Clustering
Jiaming Zhan and
Han Tong Loh
Additional contact information
Jiaming Zhan: Department of Mechanical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Singapore
Han Tong Loh: Department of Mechanical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Singapore
Journal of Information & Knowledge Management (JIKM), 2007, vol. 06, issue 03, 181-188
Abstract:
Document clustering is a significant research issue in information retrieval and text mining. Traditionally, most clustering methods have been based on the vector space model, which has limitations such as high dimensionality and weakness in handling synonymy and polysemy. Latent semantic indexing (LSI) can deal with such problems to some extent. Previous studies have shown that using LSI can reduce the time needed to cluster a large document set while having little effect on clustering accuracy. However, when clustering a small document set, accuracy is of greater concern than efficiency. In this paper, we demonstrate that LSI can improve the clustering accuracy of a small document set, and we recommend the number of dimensions needed to achieve the best clustering performance.
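The synonymy point in the abstract can be illustrated with a minimal sketch. It is not the authors' method, data, or recommended dimensionality: the toy corpus, the function names, and the choice of k = 2 are all assumptions made for illustration. To stay dependency-free, it approximates the truncated SVD with power iteration on the document Gram matrix rather than calling a library routine.

```python
import math

# Hypothetical toy corpus (not from the paper): "car" and "automobile"
# act as synonyms inside one topic; the last two documents form another.
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "stock market trading",
    "stock market investor",
]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def term_document_vectors(texts):
    """Raw term-count vectors (vector space model), one per document."""
    vocab = sorted({w for t in texts for w in t.split()})
    return [[float(t.split().count(w)) for w in vocab] for t in texts]

def lsi_document_vectors(doc_vectors, k, iterations=500):
    """Rank-k LSI coordinates of each document.

    Uses power iteration with deflation on the Gram matrix A^T A, whose
    eigenvectors are the right singular vectors of the term-document
    matrix A. Each document's coordinate on axis i is sigma_i * v_i[doc].
    """
    n = len(doc_vectors)
    gram = [[sum(a * b for a, b in zip(doc_vectors[i], doc_vectors[j]))
             for j in range(n)] for i in range(n)]
    coords = [[] for _ in range(n)]
    for _axis in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _step in range(iterations):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # remaining matrix is numerically zero
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n))
                         for i in range(n))
        sigma = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i].append(sigma * v[i])
        # Deflate: remove the component just found before the next axis.
        for i in range(n):
            for j in range(n):
                gram[i][j] -= eigenvalue * v[i] * v[j]
    return coords

vsm = term_document_vectors(docs)
lsi = lsi_document_vectors(vsm, k=2)  # k=2 is an assumed, illustrative value

vsm_sim = cosine(vsm[0], vsm[1])
lsi_sim = cosine(lsi[0], lsi[1])
print(f"VSM cosine(doc0, doc1) = {vsm_sim:.3f}")
print(f"LSI cosine(doc0, doc1) = {lsi_sim:.3f}")
```

In this toy setting the two documents that differ only in "car" versus "automobile" become more similar in the reduced space than in the raw term space, which is the kind of effect a clustering algorithm run on LSI coordinates could benefit from. Whether that carries over to a real small document set, and at which dimensionality, is what the paper itself investigates.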
Keywords: Latent semantic indexing; vector space model; document clustering; information retrieval; text mining
Date: 2007
Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0219649207001755
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:06:y:2007:i:03:n:s0219649207001755
DOI: 10.1142/S0219649207001755
Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh
Bibliographic data for series maintained by Tai Tone Lim.