Overlapping thematic structures extraction with mixed-membership stochastic blockmodel
Shuo Xu,
Junwan Liu,
Dongsheng Zhai,
Xin An,
Zheng Wang and
Hongshen Pang
Additional contact information
Shuo Xu: Beijing University of Technology
Junwan Liu: Beijing University of Technology
Dongsheng Zhai: Beijing University of Technology
Xin An: Beijing Forestry University
Zheng Wang: Institute of Scientific and Technical Information of China
Hongshen Pang: Shenzhen University
Scientometrics, 2018, vol. 117, issue 1, No 5, 84 pages
Abstract:
It is increasingly important to automatically identify thematic structures in the massive scientific literature. Because of interdisciplinarity, thematic structures have no natural boundaries. In this work, the identification of thematic structures is treated as an overlapping community detection problem on a large-scale citation-link network. A mixed-membership stochastic blockmodel, armed with a stochastic variational inference algorithm, is used to detect the overlapping thematic structures. Meanwhile, to enhance readability, each theme is labeled with several topical terms by a soft mutual information based method. Extensive experimental results on the astro dataset indicate that the mixed-membership stochastic blockmodel primarily uses local information and allows for pervasive overlaps, but it favors similar-sized themes, which disqualifies this approach from being used to extract thematic structures from scientific literature. In addition, the thematic structures from the bibliographic coupling network are similar to those from the co-citation network.
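The abstract compares themes from a bibliographic coupling network with themes from a co-citation network but does not say how either network was built. As a minimal sketch under the standard definitions (two papers are coupled when they share references; two references are co-cited when one paper cites both), the two edge-weight tables can be derived from the same citation links. The function name `coupling_and_cocitation` and the toy paper identifiers are hypothetical and are not taken from the article.

```python
from itertools import combinations


def coupling_and_cocitation(citations):
    """Build weighted edges for both networks from citation links.

    citations: dict mapping each citing paper to the set of papers it cites.
    Returns (coupling, cocitation), each a dict {(a, b): weight} with a < b.
    """
    # Bibliographic coupling: weight = number of references two papers share.
    coupling = {}
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:
            coupling[(a, b)] = shared

    # Co-citation: weight = number of papers that cite both references.
    cocitation = {}
    for refs in citations.values():
        for a, b in combinations(sorted(refs), 2):
            cocitation[(a, b)] = cocitation.get((a, b), 0) + 1

    return coupling, cocitation


# Hypothetical toy data: three citing papers and three cited references.
toy = {"p1": {"r1", "r2"}, "p2": {"r2", "r3"}, "p3": {"r1", "r2"}}
coupling, cocitation = coupling_and_cocitation(toy)
```

In this toy case the coupling network links citing papers (p1, p2, p3) and the co-citation network links cited references (r1, r2, r3), so the two networks cover different node sets even though they come from the same links.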
Keywords: Overlapping thematic structure; Mixed-membership stochastic blockmodel; Stochastic variational inference; Soft mutual information; Cluster labeling
Date: 2018
Citations: 9 (in EconPapers)
Downloads: (external link)
http://link.springer.com/10.1007/s11192-018-2841-4 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:117:y:2018:i:1:d:10.1007_s11192-018-2841-4
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-018-2841-4
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.