Machine Learning Techniques in Web Content Mining: A Comparative Analysis

Basavaraj S. Anami, Ramesh S. Wadawadagi and Veerappa B. Pagi
Additional contact information
Basavaraj S. Anami: KLE Institute of Technology, HUBLI, India
Ramesh S. Wadawadagi: Basaveshwar Engineering College, BAGALKOT, India
Veerappa B. Pagi: Basaveshwar Engineering College, BAGALKOT, India

Journal of Information & Knowledge Management (JIKM), 2014, vol. 13, issue 01, 1-12

Abstract: With the incessantly growing amount of information published on Web pages, the World Wide Web (WWW) has become a prolific subject of data mining research. The heterogeneous and semi-structured nature of Web data makes automated knowledge discovery challenging. Web Content Mining (WCM) essentially uses data mining techniques to discover knowledge from Web page contents. This study provides a comparative analysis of the Machine Learning (ML) techniques available in the literature for WCM. The analysis uses representation techniques, learning methods, datasets used and the performance of each method as its criteria. The survey observes that some traditional ML algorithms have been used efficiently on Web data. Finally, the paper concludes by citing some promising issues for further research in this domain.

Keywords: Web mining; information retrieval; web content mining; web page classification; web page clustering; semantic web
Date: 2014
Citations: 3 in EconPapers

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219649214500051
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:13:y:2014:i:01:n:s0219649214500051

DOI: 10.1142/S0219649214500051

Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh

More articles in Journal of Information & Knowledge Management (JIKM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.

 
Page updated 2025-03-20
Handle: RePEc:wsi:jikmxx:v:13:y:2014:i:01:n:s0219649214500051