Using Latent Semantic Indexing to Improve the Accuracy of Document Clustering

Jiaming Zhan and Han Tong Loh
Additional contact information
Jiaming Zhan: Department of Mechanical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Singapore
Han Tong Loh: Department of Mechanical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Singapore

Journal of Information & Knowledge Management (JIKM), 2007, vol. 06, issue 03, 181-188

Abstract: Document clustering is a significant research issue in information retrieval and text mining. Traditionally, most clustering methods have been based on the vector space model, which has limitations such as high dimensionality and weakness in handling synonymy and polysemy. Latent semantic indexing (LSI) can deal with these problems to some extent. Previous studies have shown that LSI can reduce the time needed to cluster a large document set while having little effect on clustering accuracy. However, when clustering a small document set, accuracy is of greater concern than efficiency. In this paper, we demonstrate that LSI can improve the clustering accuracy of a small document set, and we recommend the dimensionality needed to achieve the best clustering performance.
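
Illustration (not from the paper): the synonymy problem described in the abstract can be sketched on a toy corpus. The example below assumes NumPy, a made-up five-document corpus, raw term counts, and k = 2 latent dimensions; all function names are hypothetical, and the final grouping step is only a crude stand-in for a real clustering algorithm. It is a minimal sketch of LSI by truncated SVD, not the authors' experimental setup.

```python
import numpy as np

# Toy corpus, invented for illustration; not data from the paper.
docs = [
    "car engine",
    "automobile engine",
    "automobile motor",
    "flower petal",
    "flower garden",
]

def term_document_matrix(texts):
    """Build a raw term-count matrix: one row per term, one column per document."""
    vocab = sorted({word for text in texts for word in text.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(texts)))
    for j, text in enumerate(texts):
        for word in text.split():
            matrix[index[word], j] += 1.0
    return matrix

def lsi_document_vectors(matrix, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    _, singular_values, vt = np.linalg.svd(matrix, full_matrices=False)
    # Scale the top-k right singular vectors by their singular values;
    # after the transpose, row j holds the coordinates of document j.
    return (singular_values[:k, None] * vt[:k]).T

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

A = term_document_matrix(docs)
raw_vectors = A.T                      # plain vector space model
lsi_vectors = lsi_document_vectors(A, k=2)

# "car engine" and "automobile motor" share no terms.
raw_sim = cosine(raw_vectors[0], raw_vectors[2])
lsi_sim = cosine(lsi_vectors[0], lsi_vectors[2])

# "car engine" versus "flower petal": unrelated topics.
cross_sim = cosine(lsi_vectors[0], lsi_vectors[3])

# Crude grouping: assign each document to its dominant latent dimension.
labels = np.argmax(np.abs(lsi_vectors), axis=1).tolist()

print(raw_sim, lsi_sim, cross_sim, labels)
```

In the plain vector space model the two vehicle documents with no shared terms have zero cosine similarity. In the two-dimensional latent space they become nearly identical, because "automobile engine" links their vocabularies, while the vehicle and flower documents stay close to orthogonal. On real collections the effect is less clean and depends on term weighting and on the number of dimensions retained.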

Keywords: Latent semantic indexing; vector space model; document clustering; information retrieval; text mining
Date: 2007

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219649207001755
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:06:y:2007:i:03:n:s0219649207001755

DOI: 10.1142/S0219649207001755

Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh

Bibliographic data for series maintained by Tai Tone Lim.

 
Page updated 2025-03-20
Handle: RePEc:wsi:jikmxx:v:06:y:2007:i:03:n:s0219649207001755