A Novel Architecture for Deep Web Crawler

Sharma, Dilip Kumar; Sharma, A. K.

Dilip Kumar Sharma and A. K. Sharma
Additional contact information
Dilip Kumar Sharma: Shobhit University, India
A. K. Sharma: YMCA University of Science and Technology, India

International Journal of Information Technology and Web Engineering (IJITWE), 2011, vol. 6, issue 1, 25-48

Abstract: A traditional crawler picks up a URL, retrieves the corresponding page, extracts its links and adds them to the queue. A deep Web crawler, after adding links to the queue, also checks the page for forms. If forms are present, it processes them and retrieves the required information. Various techniques have been proposed for crawling deep Web information, but much remains undiscovered. In this paper, the authors analyze and compare important deep Web information crawling techniques to find their relative limitations and advantages. To minimize the limitations of existing deep Web crawlers, a novel architecture is proposed based on QIIIEP specifications (Sharma & Sharma, 2009). The proposed architecture is cost-effective and offers both privatized search and general search for deep Web data hidden behind HTML forms.
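
The crawl loop described in the abstract can be illustrated with a minimal Python sketch. It is not the QIIIEP-based architecture proposed in the paper, only the generic "dequeue, fetch, extract links, check for forms" step. The names `LinkAndFormParser` and `crawl`, the `fetch` callable and the `max_pages` limit are assumptions made for this illustration, and form processing (filling and submitting) is deliberately left out: forms are only detected and recorded.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndFormParser(HTMLParser):
    """Collects hyperlinks and a summary of each form on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = []
        self._current_form = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self._current_form = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current_form)
        elif tag in ("input", "select", "textarea") and self._current_form is not None:
            if attrs.get("name"):
                self._current_form["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current_form = None


def crawl(seed, fetch, max_pages=50):
    """Breadth-first crawl; `fetch(url)` is assumed to return HTML text or None."""
    queue = deque([seed])
    seen = {seed}
    visited = []
    forms_found = {}
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkAndFormParser()
        parser.feed(html)
        # Traditional crawler step: add newly discovered links to the queue.
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
        # Deep Web crawler step: after queuing links, check the page for forms.
        if parser.forms:
            forms_found[url] = parser.forms
    return visited, forms_found
```

Passing `fetch` as a parameter keeps the sketch independent of any particular HTTP client; a dictionary lookup such as `pages.get` is enough to exercise it on in-memory pages.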

Date: 2011

Downloads: (external link)
https://services.igi-global.com/resolvedoi/resolve ... 018/jitwe.2011010103 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jitwe0:v:6:y:2011:i:1:p:25-48

International Journal of Information Technology and Web Engineering (IJITWE) is currently edited by Ghazi I. Alkhatib

More articles in International Journal of Information Technology and Web Engineering (IJITWE) from IGI Global Scientific Publishing