A Novel Architecture for Deep Web Crawler
Dilip Kumar Sharma and A. K. Sharma
Additional contact information
Dilip Kumar Sharma: Shobhit University, India
A. K. Sharma: YMCA University of Science and Technology, India
International Journal of Information Technology and Web Engineering (IJITWE), 2011, vol. 6, issue 1, 25-48
Abstract:
A traditional crawler picks up a URL, retrieves the corresponding page, extracts its links, and adds them to the queue. A deep Web crawler, after adding links to the queue, also checks the page for forms. If forms are present, it processes them and retrieves the required information. Various techniques have been proposed for crawling deep Web information, but much remains undiscovered. In this paper, the authors analyze and compare important deep Web crawling techniques to identify their relative limitations and advantages. To minimize the limitations of existing deep Web crawlers, a novel architecture is proposed based on the QIIIEP specifications (Sharma & Sharma, 2009). The proposed architecture is cost effective and supports both privatized search and general search for deep Web data hidden behind HTML forms.
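The crawl loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture and does not implement QIIIEP. It only shows the generic steps the abstract names: take a URL from the queue, retrieve the page, enqueue its links, and then check for forms. The names `LinkAndFormParser`, `crawl`, `fetch`, and the in-memory `PAGES` site are all hypothetical, and form filling and submission are deliberately left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndFormParser(HTMLParser):
    """Collects hyperlink targets and form descriptions from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.forms = []
        self._current_form = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "form":
            self._current_form = {
                "action": urljoin(self.base_url, attrs.get("action") or ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current_form)
        elif tag in ("input", "select", "textarea") and self._current_form is not None:
            if attrs.get("name"):
                self._current_form["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current_form = None


def crawl(seed_url, fetch, max_pages=10):
    """Breadth-first crawl; `fetch` is an assumed callable: URL -> HTML or None."""
    queue = deque([seed_url])
    seen = {seed_url}
    found_forms = []
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()          # pick up a URL
        html = fetch(url)              # retrieve the corresponding page
        pages += 1
        if html is None:
            continue
        parser = LinkAndFormParser(url)
        parser.feed(html)
        for link in parser.links:      # extract links and add them to the queue
            if link not in seen:
                seen.add(link)
                queue.append(link)
        found_forms.extend(parser.forms)  # deep Web step: check for forms
    return seen, found_forms


# Hypothetical in-memory site standing in for real HTTP requests.
PAGES = {
    "http://example.test/": '<a href="/search">Search</a><a href="/about">About</a>',
    "http://example.test/search": (
        '<form action="/results" method="post">'
        '<input name="q"><select name="category"></select></form>'
    ),
    "http://example.test/about": '<a href="/">Home</a>',
}

visited, forms = crawl("http://example.test/", PAGES.get)
```

A traditional crawler would stop at the link-extraction step. The only difference shown here is that detected forms are recorded so that a later stage could decide how to fill and submit them.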
Date: 2011
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/jitwe.2011010103 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jitwe0:v:6:y:2011:i:1:p:25-48
International Journal of Information Technology and Web Engineering (IJITWE) is currently edited by Ghazi I. Alkhatib