Using SVD for text classification
Aliya Nugumanova and
Yerzhan Baiburin
Additional contact information
Aliya Nugumanova: D. Serikbayev East Kazakhstan State Technical University
Yerzhan Baiburin: D. Serikbayev East Kazakhstan State Technical University
No 702094, Proceedings of International Academic Conferences from International Institute of Social and Economic Sciences
Abstract:
Singular value decomposition (SVD) factors a matrix in a way that yields successive low-rank approximations, and this decomposition can reveal the internal structure of the matrix. The method is very useful for text mining. A co-occurrence matrix (terms-by-documents matrix) defined over a large corpus of text documents usually contains a lot of noise. SVD makes it possible to approximate the co-occurrence matrix and thereby reveal the internal (latent) structure of the text corpus. The approximation reduces noise, removes unnecessary (random) links between terms, and strengthens the important information. In this paper we apply SVD to improve text classification: we build a co-occurrence matrix, approximate it by SVD, and use the resulting matrix to create a new feature space. We validate our approach with experiments on the Reuters Text Classification Collection.
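The pipeline described in the abstract (build a terms-by-documents matrix, approximate it by SVD, derive a new feature space) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the raw-count weighting, the choice k = 2, the helper names, and the use of the scaled right singular vectors as document features are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def build_term_document_matrix(docs):
    """Return (A, vocab): A[i, j] counts term i in document j (raw counts, an assumption)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, vocab


def truncated_svd(A, k):
    """Keep the k largest singular values and their singular vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def rank_k_approximation(A, k):
    """Rank-k approximation of A built from the truncated SVD."""
    U, s, Vt = truncated_svd(A, k)
    return (U * s) @ Vt


def document_features(A, k):
    """One k-dimensional vector per document: the columns of diag(s) @ Vt, transposed."""
    _, s, Vt = truncated_svd(A, k)
    return (Vt * s[:, None]).T


# Hypothetical toy corpus (not the Reuters collection).
docs = [
    "oil prices rise as crude supply falls",
    "crude oil exports fall",
    "wheat harvest grows",
    "grain and wheat exports grow",
]

A, vocab = build_term_document_matrix(docs)
A_k = rank_k_approximation(A, k=2)      # smoothed terms-by-documents matrix
features = document_features(A, k=2)    # shape: (number of documents, 2)
```

The rows of `features` could then be passed to any standard classifier in place of the raw term counts; which classifier and which value of k the paper uses is not stated in the abstract.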
Keywords: SVD; text classification; text mining
JEL-codes: C02
Pages: 1 page
Date: 2014-10
Published in Proceedings of the 12th International Academic Conference, Prague, Oct 2014, pages 857-857
Downloads: (external link)
https://iises.net/proceedings/12th-international-a ... id=7&iid=99&rid=2094 First version, 2014
Persistent link: https://EconPapers.repec.org/RePEc:sek:iacpro:0702094