A Graph-Based Keyword Extraction Method for Academic Literature Knowledge Graph Construction
Lin Zhang,
Yanan Li and
Qinru Li
Additional contact information
Lin Zhang: School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China
Yanan Li: School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China
Qinru Li: School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China
Mathematics, 2024, vol. 12, issue 9, 1-25
Abstract:
In this paper, we construct an academic literature knowledge graph based on the relationships between documents to facilitate the storage and study of academic literature data. Keywords are an important type of node in the knowledge graph. Because some documents lack keywords, which hampers knowledge graph construction, we propose an improved keyword extraction algorithm, TP-CoGlo-TextRank, that uses word frequency, word position, word co-occurrence frequency, and a word-embedding model. Word frequency and position in the document are combined to differentiate the importance of words. The GloVe word-embedding model brings knowledge external to the documents into the TextRank algorithm; combined with the word co-occurrence frequency within the documents, it makes the transfer of scores between adjacent words non-uniform. Finally, the highest-scoring words are merged into phrases if they are adjacent in the original text. Experiments verify the validity of the TP-CoGlo-TextRank algorithm. On this basis, the Neo4j graph database is used to store and display the academic literature knowledge graph, providing data support for research tasks such as text clustering, automatic summarization, and question answering.
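The general idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' TP-CoGlo-TextRank: the exact weighting formulas are not given here, so the edge weight (co-occurrence count scaled by embedding similarity), the restart distribution (word frequency with a boost for title words), the parameter names, and the toy two-dimensional vectors standing in for GloVe embeddings are all assumptions made for illustration. The input is assumed to be already tokenized and stop-word filtered.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_textrank(tokens, embeddings, window=2, damping=0.85,
                      title_len=0, title_boost=2.0, iterations=50):
    """TextRank-style scores on a weighted co-occurrence graph.

    Hypothetical weighting, chosen for illustration only:
      edge weight  = co-occurrence count * (1 + cosine similarity)
      restart mass = word frequency, boosted for the first `title_len` tokens
    """
    # Count co-occurrences of distinct words inside a sliding window.
    cooc = defaultdict(Counter)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if v != w:
                cooc[w][v] += 1
                cooc[v][w] += 1

    vocab = sorted(set(tokens))

    # Non-uniform edge weights: internal counts scaled by external similarity.
    weight = {w: {} for w in vocab}
    for w in vocab:
        for v, count in cooc[w].items():
            sim = 0.0
            if w in embeddings and v in embeddings:
                sim = max(cosine(embeddings[w], embeddings[v]), 0.0)
            weight[w][v] = count * (1.0 + sim)
    out_sum = {w: sum(weight[w].values()) for w in vocab}

    # Restart distribution from frequency and position.
    prior = Counter()
    for i, w in enumerate(tokens):
        prior[w] += title_boost if i < title_len else 1.0
    total = sum(prior.values())
    prior = {w: p / total for w, p in prior.items()}

    # Power iteration with a biased restart.
    score = dict(prior)
    for _ in range(iterations):
        score = {
            w: (1 - damping) * prior[w] + damping * sum(
                score[v] * weight[v][w] / out_sum[v] for v in weight[w])
            for w in vocab
        }
    return score


def merge_adjacent(tokens, top_words):
    """Join runs of consecutive top-scoring words into candidate phrases."""
    phrases, run = [], []
    for w in tokens + [None]:  # None is a sentinel that flushes the last run
        if w in top_words:
            run.append(w)
        else:
            if run:
                phrases.append(" ".join(run))
            run = []
    return list(dict.fromkeys(phrases))  # de-duplicate, keep first-seen order


# Toy input: the first two tokens play the role of a title.
tokens = ["keyword", "extraction", "graph", "keyword", "extraction",
          "builds", "knowledge", "graph", "academic", "literature"]
# Made-up 2-d vectors standing in for pretrained GloVe embeddings.
embeddings = {"keyword": [1.0, 0.2], "extraction": [0.9, 0.3],
              "knowledge": [0.2, 1.0], "graph": [0.3, 0.9]}

scores = weighted_textrank(tokens, embeddings, title_len=2)
top = set(sorted(scores, key=scores.get, reverse=True)[:4])
print(merge_adjacent(tokens, top))
```

In this sketch the restart distribution carries the frequency and position signal, while the edge weights carry the co-occurrence and embedding signal, so both affect the final ranking without changing the basic TextRank iteration.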
Keywords: keyword extraction; TextRank; word embedding; text statistical features; academic literature knowledge graph
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/9/1349/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/9/1349/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:9:p:1349-:d:1385504
Mathematics is currently edited by Ms. Emma He