Multi-View Meets Average Linkage: Exploring the Role of Metadata in Document Clustering

Divya Teja Ravoori and Zhengxin Chen
Additional contact information
Divya Teja Ravoori: Department of Computer Science, University of Nebraska at Omaha, Omaha, NE, USA
Zhengxin Chen: Department of Computer Science, University of Nebraska at Omaha, Omaha, NE, USA

International Journal of Information Retrieval Research (IJIRR), 2015, vol. 5, issue 2, 26-42

Abstract: Inspired by the success of the recently developed algorithm MVSC-IR, the authors embed the idea of the Multi-Viewpoint Based Similarity Measure for clustering (MVSC) into a hierarchical clustering method, namely average linkage clustering, to overcome the problem of initialization with random seeds. The result is a new algorithm, referred to as MVSC-HAC. The improved performance of this new algorithm encouraged the authors to further explore the impact of metadata in document clustering. In this paper, after reviewing two existing algorithms, the authors describe their new algorithm and present experimental results on data sets of various sizes at two different levels: one using the entire context of the documents and one using the documents' existing meta tags. The results show that MVSC-HAC excels at both levels. The authors analyze the results and provide a discussion, based on other observations, of the role of metadata in document clustering.
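
The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea it names: average linkage clustering driven by a multi-viewpoint similarity rather than by randomly seeded partitions. It is not the authors' MVSC-HAC implementation. The function names (`multi_viewpoint_similarity`, `average_linkage`), the choice of every other document as a viewpoint, and the toy vectors are all assumptions made for illustration.

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def multi_viewpoint_similarity(docs, i, j):
    """Average of (d_i - d_h) . (d_j - d_h) over all viewpoints d_h.

    Assumption: every document other than d_i and d_j serves as a viewpoint.
    """
    viewpoints = [h for h in range(len(docs)) if h not in (i, j)]
    if not viewpoints:
        # Assumption: with no viewpoint available, fall back to the dot product.
        return dot(docs[i], docs[j])
    total = sum(
        dot(sub(docs[i], docs[h]), sub(docs[j], docs[h])) for h in viewpoints
    )
    return total / len(viewpoints)


def average_linkage(docs, k):
    """Merge the most similar pair of clusters until only k clusters remain."""
    n = len(docs)
    sim = {
        (i, j): multi_viewpoint_similarity(docs, i, j)
        for i, j in combinations(range(n), 2)
    }

    def link(a, b):
        # Average pairwise similarity between the members of two clusters.
        total = sum(sim[(min(i, j), max(i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [[i] for i in range(n)]  # start from singletons, no random seeds
    while len(clusters) > k:
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return clusters


# Toy term vectors (hypothetical data): two documents per topic.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = average_linkage(docs, 2)
```

Because the procedure starts from singleton clusters and merges deterministically, repeated runs on the same input give the same grouping, which is the property the abstract contrasts with random seeding.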

Date: 2015

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJIRR.2015040102 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jirr00:v:5:y:2015:i:2:p:26-42

International Journal of Information Retrieval Research (IJIRR) is currently edited by Zhongyu Lu

More articles in International Journal of Information Retrieval Research (IJIRR) from IGI Global
Bibliographic data for series maintained by Journal Editor.

 
Page updated 2025-03-19
Handle: RePEc:igg:jirr00:v:5:y:2015:i:2:p:26-42