
Document–document similarity approaches and science mapping: Experimental comparison of five approaches

Per Ahlgren and Cristian Colliander

Journal of Informetrics, 2009, vol. 3, issue 1, 49-63

Abstract: This paper treats document–document similarity approaches in the context of science mapping. Five approaches, involving nine methods, are compared experimentally. We compare text-based approaches, the citation-based bibliographic coupling approach, and approaches that combine text-based approaches and bibliographic coupling. Forty-three articles, published in the journal Information Retrieval, are used as test documents. We investigate how well the approaches agree with a ground truth subject classification of the test documents, when the complete linkage method is used, and under two types of similarities, first-order and second-order. The results show that it is possible to achieve a very good approximation of the classification by means of automatic grouping of articles. One text-only method and one combination method, under second-order similarities in both cases, give rise to cluster solutions that to a large extent agree with the classification.
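The abstract names first-order and second-order similarities and the complete linkage method without defining them on this page. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. It assumes that a first-order similarity is the cosine between two documents' own vectors (here, binary cited-reference vectors, as in bibliographic coupling) and that a second-order similarity is the cosine between their first-order similarity profiles. The toy data and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def first_order(vectors):
    """Assumed first-order similarity: cosine between the documents' own vectors."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

def second_order(sim):
    """Assumed second-order similarity: cosine between rows of the
    first-order matrix, i.e. between similarity profiles."""
    return first_order(sim)

def complete_linkage(sim, k):
    """Agglomerative clustering down to k clusters. Under complete linkage,
    the similarity of two clusters is the minimum pairwise similarity
    between their members; the most similar pair of clusters is merged."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = min(sim[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or s > best[0]:
                    best = (s, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy data: rows are documents, columns are cited references
# (1 = the document cites that reference).
docs = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]

first = first_order(docs)
second = second_order(first)

print(complete_linkage(first, 2))
print(complete_linkage(second, 2))
```

In this toy case both similarity types recover the same two groups. On real data the two can differ, because a second-order similarity also reflects how each document relates to every other document in the set.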

Keywords: Citation data; Textual data; Data source combination; Cluster analysis; Science mapping
Date: 2009

Downloads:
http://www.sciencedirect.com/science/article/pii/S1751157708000680
Full text for ScienceDirect subscribers only



Persistent link: https://EconPapers.repec.org/RePEc:eee:infome:v:3:y:2009:i:1:p:49-63

DOI: 10.1016/j.joi.2008.11.003


Journal of Informetrics is currently edited by Leo Egghe

Handle: RePEc:eee:infome:v:3:y:2009:i:1:p:49-63