A time-series based aggregation scheme for topic detection in Weibo short texts
Tinghuai Ma,
Jing Li,
Xinnian Liang,
Yuan Tian,
Abdullah Al-Dhelaan and
Mohammed Al-Dhelaan
Physica A: Statistical Mechanics and its Applications, 2019, vol. 536, issue C
Abstract:
Discovering hot topics within social network like Twitter and Weibo, has received much attention in recent years. While topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied in topic discovery, they are often less coherent when applied to microblog content which is known as “posts”. In this paper, we propose a time-series based aggregation scheme for topic modeling in Weibo. As Weibo topics are coherent within a time slice, we divide Weibo dataset into groups by time slice. With this scheme, posts in every group are aggregated into several longer pseudo-documents using paragraph-vector based similarity algorithms. While applying this scheme to LDA model, we dramatically decrease the topic model perplexity and increase the clustering quality, which also allows for better discovery of underlying topics in Weibo. Furthermore, we can let other topic models extended on LDA be directly used on such short texts.
Keywords: Topic discovery; Short texts; Aggregation; Time-series (search for similar items in EconPapers)
Date: 2019
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S037843711930576X
Full text for ScienceDirect subscribers only. Journal offers the option of making the article available online on Science direct for a fee of $3,000
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:eee:phsmap:v:536:y:2019:i:c:s037843711930576x
DOI: 10.1016/j.physa.2019.04.208
Access Statistics for this article
Physica A: Statistical Mechanics and its Applications is currently edited by K. A. Dawson, J. O. Indekeu, H.E. Stanley and C. Tsallis
More articles in Physica A: Statistical Mechanics and its Applications from Elsevier
Bibliographic data for series maintained by Catherine Liu ().