A new network model for extracting text keywords
Liu Yang, Keping Li and Hangfei Huang
Liu Yang: Beijing Jiaotong University
Keping Li: Beijing Jiaotong University
Hangfei Huang: Beijing Jiaotong University
Scientometrics, 2018, vol. 116, issue 1, 339-361
Abstract: Text keywords are the meaningful and important words in a document; they provide a precise overview of its content and reflect the author’s writing intention. Keyword extraction methods have received considerable attention, and network-based methods are among them. However, existing network-based keyword extraction methods consider only the connections between words in a document and ignore the impact of sentences. Because a sentence is made of many words, and the words within a sentence affect one another, neglecting the influence of sentences results in a loss of information. In this paper, we introduce a word network whose nodes represent the words in a document, and we call any keyword extraction method based on a word network a Word-net method. We then propose a new network model that considers the influence of sentences, and a new word-sentence method based on that model. Experimental results demonstrate that our method outperforms the Word-net method, the classical term frequency-inverse document frequency (TF-IDF) method, the most-frequent method and the TextRank method. The precision, recall and F-measure of our result are 7.95%, 8.27% and 6.54% higher, respectively, than those of the Word-net result, and the average precision of our result is 17.56% higher than that of the TF-IDF result. A two-way analysis of variance is employed to validate the empirical analysis; it indicates that the keyword extraction method and the number of keywords have statistically significant effects on the evaluation metric values.
Keywords: Keyword extraction; Complex network; Synthetic eigenvalue; Text keyword; Network theory
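The abstract does not spell out how the word network or the word-sentence model is built, so the following is only a minimal sketch of a generic word-network baseline, not the authors' method. It assumes that two words are linked when they occur in the same sentence, that edge weights count the sentences they share, and that words are ranked by weighted degree. The function names, the stopword list and the sentence splitter are all hypothetical choices made for illustration. The last function computes precision, recall and F-measure in their standard set-based form.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "and", "is", "are", "to", "on", "for", "by"}


def split_sentences(text):
    """Split on runs of '.', '!' or '?' and drop empty fragments."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def build_word_network(text):
    """Return edge weights: (word_a, word_b) -> number of sentences containing both.

    Assumption: the sentence is the co-occurrence window. The paper may
    define edges differently.
    """
    edges = Counter()
    for sentence in split_sentences(text):
        words = sorted(set(tokenize(sentence)))
        for u, v in combinations(words, 2):
            edges[(u, v)] += 1
    return edges


def rank_keywords(text, k):
    """Rank words by weighted degree; ties are broken alphabetically."""
    strength = Counter()
    for (u, v), weight in build_word_network(text).items():
        strength[u] += weight
        strength[v] += weight
    ranked = sorted(strength.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def precision_recall_f(extracted, reference):
    """Standard set-based precision, recall and F-measure (F1)."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

A baseline of this kind uses sentences only as a window for linking words. The abstract's criticism is that such word-only networks lose sentence-level information, which is the gap the proposed word-sentence model is meant to address.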
Downloads: http://link.springer.com/10.1007/s11192-018-2743-5 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:116:y:2018:i:1:d:10.1007_s11192-018-2743-5
Scientometrics is currently edited by Wolfgang Glänzel