Comparing semantic representation methods for keyword analysis in bibliometric research
Guo Chen, Siqi Hong, Chenxin Du, Panting Wang, Zeyu Yang and Lu Xiao
Journal of Informetrics, 2024, vol. 18, issue 3
Abstract:
Semantic representation methods play a crucial role in text mining tasks. Although numerous approaches have been proposed and compared in text mining research, the comparison of semantic representation methods specifically for publication keywords in bibliometric studies has received limited attention. This lack of practical evidence makes it challenging for researchers to select suitable methods to obtain keyword vectors for downstream bibliometric tasks, potentially hindering the achievement of optimal results. To address this gap, this study conducts an experimental comparison of various typical semantic representation methods for keywords, aiming to provide quantitative evidence for bibliometric studies. The experiment focuses on keyword clustering as the fundamental task and evaluates 22 variations of five typical methods across four scientific domains. The methods compared are co-word matrix, co-word network, word embedding, network embedding, and “semantic + structure” integration. The comparison is based on fitting the clustering results of these methods with the “evaluation standard” specific to each domain. The empirical findings demonstrate that the co-word matrix exhibits subpar performance, whereas the co-word network and word embedding techniques display satisfactory performance. Among the five network embedding algorithms, LINE and Node2Vec outperform DeepWalk, Struc2Vec, and SDNE. Remarkably, both the “pre-training and fine-tuning” model and the “semantic + structure” model yield unsatisfactory results in terms of performance. Nevertheless, even with variations in the performance of these methods, no singular approach stands out as universally superior. When selecting methods in practical applications, comprehensive consideration of factors such as corpus size and semantic cohesion of domain keywords is crucial. 
This study advances our understanding of semantic representation methods for keyword analysis and contributes to the advancement of bibliometric analysis by providing valuable recommendations for researchers in selecting appropriate methods.
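As a rough illustration of the simplest method named in the abstract, the co-word matrix, the sketch below builds keyword vectors from per-paper keyword lists and compares them with cosine similarity. It is a minimal sketch and not the authors' implementation: the toy `papers` data, the helper names `co_word_matrix` and `cosine`, and the choice to put each keyword's paper frequency on the diagonal are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy corpus: each inner list is the keyword list of one paper.
papers = [
    ["word embedding", "text mining", "clustering"],
    ["word embedding", "text mining"],
    ["co-word network", "clustering"],
    ["co-word network", "network embedding", "clustering"],
]


def co_word_matrix(papers):
    """Return (vocab, matrix); matrix[i][j] counts papers that list both keywords.

    Assumption for this sketch: the diagonal holds each keyword's own
    paper frequency, which is one common convention among several.
    """
    vocab = sorted({kw for kws in papers for kw in kws})
    index = {kw: i for i, kw in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for kws in papers:
        unique = sorted(set(kws))  # ignore repeated keywords within one paper
        for kw in unique:
            matrix[index[kw]][index[kw]] += 1
        for a, b in combinations(unique, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


vocab, matrix = co_word_matrix(papers)
vectors = dict(zip(vocab, matrix))  # each keyword's row is its vector

sim_close = cosine(vectors["word embedding"], vectors["text mining"])
sim_far = cosine(vectors["word embedding"], vectors["co-word network"])
print(f"word embedding vs text mining:     {sim_close:.3f}")
print(f"word embedding vs co-word network: {sim_far:.3f}")
```

Row vectors of this kind could then be passed to any clustering routine. The other methods listed in the abstract replace these rows with vectors learned from text or from the co-word graph.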
Keywords: Keyword analysis; Semantic representation; Co-word analysis; Co-word network; Word embedding; Network embedding
Date: 2024
Downloads:
http://www.sciencedirect.com/science/article/pii/S1751157724000427
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:infome:v:18:y:2024:i:3:s1751157724000427
DOI: 10.1016/j.joi.2024.101529
Journal of Informetrics is currently edited by Leo Egghe
More articles in Journal of Informetrics from Elsevier
Bibliographic data for series maintained by Catherine Liu.