Document keyword extraction based on semantic hierarchical graph model
Tingting Zhang,
Baozhen Lee,
Qinghua Zhu,
Xi Han and
Ke Chen
Additional contact information
Tingting Zhang: Nanjing Audit University
Baozhen Lee: Nanjing Audit University
Qinghua Zhu: Nanjing University
Xi Han: Guangdong University of Finance and Economics
Ke Chen: Nanjing Audit University
Scientometrics, 2023, vol. 128, issue 5, No 1, 2623-2647
Abstract:
Keywords provide a brief profile of a document's contents and serve as an important means of quickly identifying the document’s themes. Traditional keyword extraction methods are mostly based on statistical relationships between words, with no deeper understanding of the words’ structures. In addition, most studies to date perform keyword extraction by ranking terms on some measure, without considering the cohesion of the extracted keyword set. In this paper, a keyword extraction method based on a semantic hierarchical graph model is proposed. First, the semantic graph for the document is constructed based on the hierarchical extraction of feature terms. Then, the keyword collection of the document is chosen from the constructed semantic graph. The keyword extraction method in this paper fully accounts for both the context of the keywords and the internal structure by which they are related. By mining the deep hidden structure of feature terms, the proposed method can effectively reveal the hierarchical association between terms within the semantic graph and obtain a keyword collection result with high probability. Moreover, several experiments conducted on released datasets show that our method outperforms the existing methods in terms of precision, recall, and F-measure.
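The abstract reports results in terms of precision, recall, and F-measure. As a minimal sketch of how these three scores are commonly computed for one document's extracted keyword set, the Python below compares extracted keywords against a gold-standard list. The function name `evaluate_keywords`, the sample keyword lists, and the exact-match rule (case-insensitive, whitespace-trimmed) are assumptions made for illustration; the abstract does not state the matching protocol the authors used.

```python
def evaluate_keywords(extracted, gold):
    """Return (precision, recall, f_measure) for one document.

    Assumption: a keyword counts as correct only on an exact match
    after lowercasing and trimming whitespace.
    """
    extracted_set = {k.strip().lower() for k in extracted}
    gold_set = {k.strip().lower() for k in gold}

    # Number of extracted keywords that also appear in the gold list.
    hits = len(extracted_set & gold_set)

    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0

    # F-measure is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical sample data, not taken from the paper's experiments.
extracted = ["semantic graph", "keyword extraction", "statistics", "text mining"]
gold = ["Keyword extraction", "Text mining", "Feature terms"]

p, r, f = evaluate_keywords(extracted, gold)
print(f"{p:.3f} {r:.3f} {f:.3f}")  # prints "0.500 0.667 0.571"
```

Here two of the four extracted keywords match the three gold keywords, so precision is 2/4 and recall is 2/3. Scores reported over a dataset are typically averages of such per-document values, though the averaging scheme is also not specified in the abstract.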
Keywords: Semantic hierarchical graph; Keyword extraction; Feature terms; Text mining
Date: 2023
Downloads: (external link)
http://link.springer.com/10.1007/s11192-023-04677-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:128:y:2023:i:5:d:10.1007_s11192-023-04677-7
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-023-04677-7
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.