
Q3-D3-LSA

Lukas Borke and Wolfgang Härdle

No 2016-049, SFB 649 Discussion Papers from Humboldt University Berlin, Collaborative Research Center 649: Economic Risk

Abstract: QuantNet is an integrated web-based environment consisting of different types of statistics-related documents and program codes. Its goal is creating reproducibility and offering a platform for sharing validated knowledge native to the social web. To increase the information retrieval (IR) efficiency there is a need for incorporating semantic information. Three text mining models will be examined: vector space model (VSM), generalized VSM (GVSM) and latent semantic analysis (LSA). The LSA has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between documents. Our results show that different model configurations allow adapted similarity-based document clustering and knowledge discovery. In particular, different LSA configurations together with hierarchical clustering reveal good results under M3 evaluation. QuantNet and the corresponding Data-Driven Documents (D3) based visualization can be found and applied under http://quantlet.de. The driving technology behind it is Q3-D3-LSA, which is the combination of 'GitHub API based QuantNet Mining infrastructure in R', LSA and D3 implementation.
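
Illustration (not part of the paper): the abstract's point about inserting term relations into the document similarity measure can be sketched on a toy example. The sketch below assumes one common GVSM formulation, in which the term-term co-occurrence matrix D D^T is placed between the document vectors; the paper's own configurations, weighting schemes and R implementation may differ. The term-by-document matrix, the term names and all function names are made up for illustration.

```python
import math

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def cosine_normalize(sim):
    """Turn a Gram-type matrix into cosine similarities."""
    n = len(sim)
    return [[sim[i][j] / math.sqrt(sim[i][i] * sim[j][j])
             if sim[i][i] and sim[j][j] else 0.0
             for j in range(n)]
            for i in range(n)]

# Hypothetical term-by-document matrix (rows: terms, columns: documents).
# doc0 = {regression}, doc1 = {ols}, doc2 = {regression, ols}, doc3 = {copula}
D = [
    [1, 0, 1, 0],  # regression
    [0, 1, 1, 0],  # ols
    [0, 0, 0, 1],  # copula
]
Dt = transpose(D)

# VSM: documents are similar only if they share terms (D^T D).
vsm = cosine_normalize(matmul(Dt, D))

# GVSM (assumed variant): term-term co-occurrences D D^T enter the
# similarity, giving D^T (D D^T) D.
term_term = matmul(D, Dt)
gvsm = cosine_normalize(matmul(matmul(Dt, term_term), D))

print("VSM  sim(doc0, doc1):", vsm[0][1])
print("GVSM sim(doc0, doc1):", gvsm[0][1])
print("GVSM sim(doc0, doc3):", gvsm[0][3])
```

In this toy setup doc0 and doc1 share no term, so their VSM similarity is zero, while the GVSM variant links them because "regression" and "ols" co-occur in doc2. LSA pursues a related goal by projecting documents onto a low-rank space obtained from a truncated singular value decomposition, which is not shown here.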

Keywords: QuantNet; D3; GitHub API; text mining; document clustering; similarity; semantic web; generalized vector space model; LSA; visualization
Date: 2016

Downloads: (external link)
https://www.econstor.eu/bitstream/10419/148886/1/875027504.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:zbw:sfb649:sfb649dp2016-049

 
Page updated 2025-03-31
Handle: RePEc:zbw:sfb649:sfb649dp2016-049