Building Self-Healing Feature Based on Faster R-CNN Deep Learning Technique in Web Data Extraction Systems

Sudhir Kumar Patnaik and C. Narendra Babu
Additional contact information
Sudhir Kumar Patnaik: Department of Computer Science and Engineering, M.S. Ramaiah University of Applied Sciences, MSR Nagar, Bangalore, India
C. Narendra Babu: Department of Computer Science and Engineering, M.S. Ramaiah University of Applied Sciences, MSR Nagar, Bangalore, India

Journal of Information & Knowledge Management (JIKM), 2022, vol. 21, issue 02, 1-27

Abstract: Web data extraction has evolved over the years, from extracting data from documents to extracting data from today’s World Wide Web (WWW). The growth of the WWW has placed data at the centre of this ecosystem and has benefited society at large, businesses and consumers. The proposed system uses a deep learning technique, the Faster region-based convolutional neural network (Faster R-CNN), for automated navigation, data extraction and self-healing of the data extraction engine, so that it adapts to dynamic changes in website layout. The system trains a Faster R-CNN model to detect products in a web page using bounding-box image detection and extracts product details with high extraction accuracy. Deep learning has advanced rapidly in different fields for image detection, but its application to data extraction makes this paper unique. An e-commerce retail website is used as a real-world example to demonstrate the self-healing capability of the proposed automated web data extraction system.
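
The abstract does not say how a detected bounding box is tied back to page content, so the following is only a minimal sketch of one way the self-healing idea could work, not the authors' implementation. It assumes the detector (for example a trained Faster R-CNN) is available as a callable that returns scored boxes for the rendered page, and that each page element is known by a selector, an on-screen box and its text. All names here (`Detection`, `Element`, `extract_with_healing`, the thresholds) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Detection:
    """One detector output: a bounding box and its confidence score."""
    box: Box
    score: float


@dataclass
class Element:
    """A rendered page element: its selector, on-screen box and text."""
    selector: str
    box: Box
    text: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def extract_with_healing(
    elements: Sequence[Element],
    selector: str,
    detect: Callable[[], Sequence[Detection]],
    min_score: float = 0.5,  # assumed confidence threshold
    min_iou: float = 0.5,    # assumed overlap threshold
) -> Optional[tuple[str, str]]:
    """Return (text, selector_used), or None if nothing usable is found."""
    # Normal path: the stored selector still matches the current layout.
    for el in elements:
        if el.selector == selector:
            return el.text, selector

    # Healing path: the layout changed, so locate the product visually.
    detections = [d for d in detect() if d.score >= min_score]
    if not detections or not elements:
        return None
    best = max(detections, key=lambda d: d.score)

    # Pick the element whose box overlaps the detected product box the most.
    match = max(elements, key=lambda el: iou(el.box, best.box))
    if iou(match.box, best.box) < min_iou:
        return None
    return match.text, match.selector  # the caller can store the new selector
```

In this sketch the detector runs only when the stored selector fails, so the visual model acts as a fallback that repairs the extraction rule rather than replacing it.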

Keywords: Adaptive; data extraction; deep learning; Faster R-CNN; self-healing
Date: 2022

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219649222500290
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:21:y:2022:i:02:n:s0219649222500290

DOI: 10.1142/S0219649222500290

Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh

More articles in Journal of Information & Knowledge Management (JIKM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.
