An Approach for Focused Crawler to Harvest Digital Academic Documents in Online Digital Libraries
Sumita Gupta,
Neelam Duhan and
Poonam Bansal
Additional contact information
Sumita Gupta: YMCAUST, Faridabad, India
Neelam Duhan: J C Bose University of Science and Technology YMCA, Faridabad, India
Poonam Bansal: MSIT, Delhi, India
International Journal of Information Retrieval Research (IJIRR), 2019, vol. 9, issue 3, 23-47
Abstract:
With the rapid growth of digital information and user need, it becomes imperative to retrieve relevant and desired domain or topic specific documents as per the user query quickly. A focused crawler plays a vital role in digital libraries to crawl the web so that researchers can easily explore the domain specific search results list and find the desired content against the query. In this article, a focused crawler is being proposed for online digital library search engines, which considers meta-data of the query in order to retrieve the corresponding document or other relevant but missing information (e.g. paid publication from ACM, IEEE, etc.) against the user query. The different query strategies are made by using the meta-data and submitted to different search engines which aim to find more relevant information which is missing. The result comes out from these search engines are filtered and then used further for crawling the Web.
Date: 2019
References: Add references at CitEc
Citations:
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJIRR.2019070103 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:igg:jirr00:v:9:y:2019:i:3:p:23-47
Access Statistics for this article
International Journal of Information Retrieval Research (IJIRR) is currently edited by Zhongyu Lu
More articles in International Journal of Information Retrieval Research (IJIRR) from IGI Global
Bibliographic data for series maintained by Journal Editor ().