Machine Learning Techniques in Web Content Mining: A Comparative Analysis
Basavaraj S. Anami,
Ramesh S. Wadawadagi and
Veerappa B. Pagi
Additional contact information
Basavaraj S. Anami: KLE Institute of Technology, HUBLI, India
Ramesh S. Wadawadagi: Basaveshwar Engineering College, BAGALKOT, India
Veerappa B. Pagi: Basaveshwar Engineering College, BAGALKOT, India
Journal of Information & Knowledge Management (JIKM), 2014, vol. 13, issue 01, 1-12
Abstract:
With the incessantly growing amount of information published on Web pages, the World Wide Web (WWW) has become a prolific subject of data mining research. The heterogeneous and semi-structured nature of Web data makes automated discovery challenging. Web Content Mining (WCM) applies data mining techniques to discover knowledge from Web page contents. This study provides a comparative analysis of the Machine Learning (ML) techniques available in the literature for WCM. The analysis uses the representation techniques, learning methods, datasets used and performance of each method as criteria. The survey observes that some traditional ML algorithms have been used efficiently on Web data. Finally, the paper concludes by citing some promising issues for further research in this domain.
Keywords: Web mining; information retrieval; web content mining; web page classification; web page clustering; semantic web
Date: 2014
Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0219649214500051
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:13:y:2014:i:01:n:s0219649214500051
DOI: 10.1142/S0219649214500051
Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh
More articles in Journal of Information & Knowledge Management (JIKM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.