Experimental evaluation of parameter settings in calculation of hybrid similarities: effects of first- and second-order similarity, edge cutting, and weighting factors
Fabian Meyer-Brötz,
Edgar Schiebel and
Leo Brecht
Additional contact information
Fabian Meyer-Brötz: University of Ulm
Edgar Schiebel: AIT Austrian Institute of Technology GmbH
Leo Brecht: University of Ulm
Scientometrics, 2017, vol. 111, issue 3, No 5, 1307-1325
Abstract:
The ongoing discussion in the bibliometric community about the best similarity measures has led to diverse insights. Although these insights are sometimes contradictory, there is one very consistent conclusion: hybrid measures outperform their individual components applied alone. While this gives an initial answer to the question of which similarity measure is best, it also raises issues that have so far been resolved only in part, and only for conventional similarity measures. In this study we therefore investigate, in the context of hybrid similarities, the impact of the weighting factors, the appropriate level of edge cutting, the performance of first-order in contrast to second-order similarities, and the interaction of these three parameters. Building upon a dataset of over 8000 articles from the manufacturing engineering field and using different parameter settings, we calculated over 100 similarity matrices. For each matrix we determined several cluster solutions at different resolution levels, ranging from 100 to 1000 clusters, and evaluated them quantitatively with a textual coherence value based on the Jensen-Shannon divergence. We found that second-order hybrid similarity measures calculated with a weighting factor of 0.6 for the citation-based similarity and a reduction to only the strongest values yield the best clustering results. Furthermore, we found the assessed parameters to be highly interdependent; for example, hybrid first-order similarity outperforms second-order similarity when no edge cutting is applied. Our results can thus serve the bibliometric community as a guideline for the appropriate application of hybrid measures.
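To make the three parameters named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the hybrid measure is taken to be a linear blend with weight w on the citation-based similarity, edge cutting is taken to mean keeping the k strongest values per document, and second-order similarity is taken to be the cosine between rows of the first-order matrix. The function names and the parameters w and k are hypothetical. The Jensen-Shannon divergence is the standard definition with base-2 logarithms; how the paper turns it into a textual coherence value is not reproduced.

```python
import numpy as np

def hybrid_similarity(cit_sim, txt_sim, w=0.6):
    """Assumed linear blend: w * citation-based + (1 - w) * text-based."""
    return w * cit_sim + (1.0 - w) * txt_sim

def cut_edges(sim, k):
    """Assumed edge cutting: keep the k strongest off-diagonal values per row."""
    if k < 1:
        raise ValueError("k must be at least 1")
    s = np.array(sim, dtype=float)
    np.fill_diagonal(s, 0.0)                 # ignore self-similarity
    top = np.argsort(s, axis=1)[:, -k:]      # columns of the k largest values per row
    rows = np.arange(s.shape[0])[:, None]
    out = np.zeros_like(s)
    out[rows, top] = s[rows, top]
    return np.maximum(out, out.T)            # an edge survives if either endpoint keeps it

def second_order(sim):
    """Assumed second-order similarity: cosine between rows of a first-order matrix."""
    s = np.asarray(sim, dtype=float)
    norms = np.linalg.norm(s, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0                # leave all-zero rows as zero vectors
    unit = s / norms
    return unit @ unit.T

def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                         # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Under these assumptions, the best-performing configuration reported in the abstract would correspond to something like second_order(cut_edges(hybrid_similarity(cit_sim, txt_sim, w=0.6), k)) for a small k, where cit_sim and txt_sim are square, symmetric document-by-document matrices.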
Keywords: Hybrid clustering; Bibliographic coupling; Textual coherence; Similarity measures; First- and second-order similarity
Date: 2017
Downloads:
http://link.springer.com/10.1007/s11192-017-2366-2 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:111:y:2017:i:3:d:10.1007_s11192-017-2366-2
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-017-2366-2
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.