Clustering Web Pages into Hierarchical Categories
Zhongmei Yao and Ben Choi
Additional contact information
Zhongmei Yao: Louisiana Tech University, USA
Ben Choi: Louisiana Tech University, USA
International Journal of Intelligent Information Technologies (IJIIT), 2007, vol. 3, issue 2, 17-35
Abstract:
Clustering is well suited to Web mining because it automatically organizes Web pages into categories, each of which contains pages with similar content. However, one problem in clustering is the lack of general methods for automatically determining the number of categories or clusters. For the Web domain, no such method suitable for Web page clustering has been available until now. To address this problem, we discovered a constant factor that characterizes the Web domain, and on that basis we propose a new method for automatically determining the number of clusters in Web page datasets. We also propose a new Bidirectional Hierarchical Clustering algorithm, which arranges individual Web pages into clusters, then arranges those clusters into larger clusters, and so on, until the average inter-cluster similarity approaches the constant factor. With the new constant factor and the new algorithm, we have developed a clustering system suitable for mining the Web.
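The stopping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' Bidirectional Hierarchical Clustering algorithm, whose details the abstract does not give; it is a plain bottom-up (agglomerative) procedure that merges the two most similar clusters until the average inter-cluster similarity falls to a threshold. The function names, the use of cosine similarity on centroid vectors, and the `stop_similarity` parameter (standing in for the paper's constant factor, whose value is not stated here) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(cluster, vectors):
    """Mean vector of the pages whose indices are listed in `cluster`."""
    dim = len(vectors[0])
    return [sum(vectors[i][d] for i in cluster) / len(cluster) for d in range(dim)]


def average_inter_cluster_similarity(clusters, vectors):
    """Mean cosine similarity over all pairs of cluster centroids."""
    cents = [centroid(c, vectors) for c in clusters]
    pairs = list(combinations(range(len(clusters)), 2))
    if not pairs:
        return 0.0
    return sum(cosine(cents[i], cents[j]) for i, j in pairs) / len(pairs)


def cluster_until_threshold(vectors, stop_similarity):
    """Merge the two most similar clusters until the average inter-cluster
    similarity is no longer above `stop_similarity` (a hypothetical stand-in
    for the constant factor described in the abstract)."""
    clusters = [[i] for i in range(len(vectors))]  # start with one page per cluster
    while (len(clusters) > 1
           and average_inter_cluster_similarity(clusters, vectors) > stop_similarity):
        cents = [centroid(c, vectors) for c in clusters]
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: cosine(cents[p[0]], cents[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Toy term-weight vectors for four hypothetical pages on two topics.
pages = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
result = cluster_until_threshold(pages, stop_similarity=0.3)
```

In this sketch the number of clusters is not supplied in advance; it falls out of where the merging stops, which is the role the abstract assigns to the constant factor.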
Date: 2007
Downloads: (external link)
https://services.igi-global.com/resolvedoi/resolve ... 4018/jiit.2007040102 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jiit00:v:3:y:2007:i:2:p:17-35
International Journal of Intelligent Information Technologies (IJIIT) is currently edited by Vijayan Sugumaran