Text representation model of scientific papers based on fusing multi-viewpoint information and its quality assessment
Yonghe Lu,
Jiayi Luo,
Ying Xiao and
Hou Zhu
Additional contact information
Yonghe Lu: Sun Yat-sen University
Jiayi Luo: Sun Yat-sen University
Ying Xiao: Sun Yat-sen University
Hou Zhu: Sun Yat-sen University
Scientometrics, 2021, vol. 126, issue 8, No 25, 6937-6963
Abstract:
Text representation is the preliminary step for in-depth analysis and mining of information in scientific papers. It directly affects the performance of downstream tasks such as classification, clustering, and similarity calculation of scientific papers. However, recent research has mainly considered the citation network and partial structural information, which is insufficient for representing scientific papers. To improve the performance of text representation models, this paper proposes MV-HATrans, a text representation model that combines multi-viewpoint information, such as the semantic information of a knowledge graph and structural information. The model extracts word information from three aspects: contextual content, part of speech, and word meaning from WordNet. By combining a hierarchical attention mechanism with a transformer, the model produces a full-text representation of scientific papers. Finally, the paper applies the proposed model to automatic quality assessment using the binary experimental dataset AAPR, which indicates whether scientific papers were accepted or not. Results show that, in the quality classification of scientific papers, adopting part-of-speech information and semantic information based on WordNet definitions achieves a prediction accuracy of 70.14%. Among all the structural modules, authors and abstracts contribute the most to the quality classification of scientific papers, especially authors, at 9.51%.
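The abstract's idea of fusing several word-level views and pooling them hierarchically can be illustrated with a small sketch. This is not the authors' MV-HATrans implementation: the transformer component is omitted, fusion by simple concatenation is an assumption, and every name here (fuse_views, attention_pool, represent_document, the query vectors) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(context_vec, pos_vec, sense_vec):
    """Assumed fusion: concatenate the contextual, part-of-speech,
    and word-meaning views of a single word."""
    return list(context_vec) + list(pos_vec) + list(sense_vec)


def attention_pool(vectors, query):
    """Weighted average of vectors; weights are softmax(dot(v, query))."""
    scores = [sum(a * b for a, b in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def represent_document(sentences, word_query, sentence_query):
    """Two-level pooling: words -> sentence vectors -> document vector."""
    sentence_vecs = [
        attention_pool([fuse_views(*word) for word in sentence], word_query)
        for sentence in sentences
    ]
    return attention_pool(sentence_vecs, sentence_query)


# Toy document: two sentences; each word is (context, POS, sense) vectors.
doc = [
    [([1.0, 0.0], [1.0], [0.5]), ([0.0, 1.0], [0.0], [0.5])],
    [([1.0, 1.0], [1.0], [0.0])],
]
doc_vec = represent_document(doc, word_query=[0.0] * 4, sentence_query=[0.0] * 4)
```

With all-zero query vectors the attention weights are uniform, so the result reduces to plain averaging at both levels. In a trained model the queries would be learned parameters.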
Keywords: Scientific papers; Text representation; Multi-viewpoint information; Part-of-speech; Word meaning
Date: 2021
Downloads: (external link)
http://link.springer.com/10.1007/s11192-021-04028-4 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:126:y:2021:i:8:d:10.1007_s11192-021-04028-4
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-021-04028-4
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.