Parallel Monte Carlo algorithms for information retrieval

V.N. Alexandrov, I.T. Dimov, A. Karaivanova and C.J.K. Tan

Mathematics and Computers in Simulation (MATCOM), 2003, vol. 62, issue 3, 289-295

Abstract: Many data mining applications require automated retrieval of text and image information, a need that has become essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required first “k” singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample the term-by-document matrix and build a much smaller one (e.g. a k×k matrix), from which we then find the first “k” triplets using standard deterministic methods. Second, we investigate how the problem can be reduced to finding the “k” largest eigenvalues using parallel Monte Carlo methods. We apply these methods both to the initial matrix and to the reduced one.
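
The first step described in the abstract (sample a much smaller matrix, then apply a standard deterministic SVD to it) can be illustrated with a minimal sketch. The abstract does not state which sampling distribution the authors use, so the length-squared (norm-proportional) row and column sampling below is an assumption chosen for illustration, not the paper's method. The function name `sample_reduced_matrix` and the parameter `s` (the side length of the reduced matrix) are hypothetical.

```python
import numpy as np

def sample_reduced_matrix(A, s, rng):
    """Build an s-by-s matrix from A by sampling rows, then columns.

    Assumption: rows and columns are drawn with probability proportional
    to their squared Euclidean norm and rescaled so that the Gram matrix
    is preserved in expectation. The paper's actual scheme may differ.
    """
    # Sample s rows of A (with replacement) and rescale them.
    row_p = (A ** 2).sum(axis=1) / (A ** 2).sum()
    rows = rng.choice(A.shape[0], size=s, p=row_p)
    R = A[rows] / np.sqrt(s * row_p[rows])[:, None]

    # Sample s columns of the row-reduced matrix and rescale them.
    col_p = (R ** 2).sum(axis=0) / (R ** 2).sum()
    cols = rng.choice(A.shape[1], size=s, p=col_p)
    return R[:, cols] / np.sqrt(s * col_p[cols])[None, :]

# Toy term-by-document matrix (6 terms, 4 documents), rank 1 by construction.
rng = np.random.default_rng(0)
A = np.outer(np.arange(1.0, 7.0), np.arange(1.0, 5.0))
W = sample_reduced_matrix(A, 3, rng)

# Deterministic SVD on the small matrix, compared with the full one.
sigma_small = np.linalg.svd(W, compute_uv=False)
sigma_full = np.linalg.svd(A, compute_uv=False)
print(sigma_small[0], sigma_full[0])
```

For a rank-1 input such as this toy matrix, the rescaling makes the leading singular value of the reduced matrix agree with that of the full matrix; for general matrices the reduced singular values are only approximations whose quality depends on `s`.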

Keywords: Singular value decomposition; Stochastic methods; Data mining; Lanczos method; Eigenvalue computation
Date: 2003

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0378475402002525
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:matcom:v:62:y:2003:i:3:p:289-295

DOI: 10.1016/S0378-4754(02)00252-5

Mathematics and Computers in Simulation (MATCOM) is currently edited by Robert Beauwens

More articles in Mathematics and Computers in Simulation (MATCOM) from Elsevier
Bibliographic data for series maintained by Catherine Liu.

Handle: RePEc:eee:matcom:v:62:y:2003:i:3:p:289-295