AN INTERNET REVIEW TOPIC HIERARCHY MINING METHOD BASED ON MODIFIED CONTINUOUS RENORMALIZATION PROCEDURE
Lin Qi,
Fei-Yan Guo,
Jian Zhang and
Yu-Wei Wang
Additional contact information
Lin Qi: School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, P. R. China
Fei-Yan Guo: School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, P. R. China; Beijing World Urban Circular Economy System (Industry), Collaborative Innovation Center, Beijing 100192, P. R. China
Jian Zhang: School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, P. R. China; Beijing International Science and Technology Cooperation, Base of Intelligent Decision and Big Data Application, Beijing 100192, P. R. China
Yu-Wei Wang: Department of East Asian Studies, University of Arizona, Tucson, AZ 85719, USA
FRACTALS (fractals), 2022, vol. 30, issue 07, 1-25
Abstract:
Mining the hierarchical structure of Internet review topics and classifying review texts at a fine granularity can help alleviate users’ information overload. However, existing hierarchical topic classification methods rely primarily on external corpora and human intervention. This study proposes a Modified Continuous Renormalization (MCR) procedure that acts on a keyword co-occurrence network with fractal characteristics to mine the topic hierarchy. First, the fractal characteristics of the keyword co-occurrence network of Internet review text are identified, for the first time, using a box-covering algorithm. Then, the MCR algorithm, based on the edge adjacency entropy and the box distance, is proposed to obtain the topic hierarchy in the keyword co-occurrence network. Verification data from Dangdang.com book reviews show that the MCR constructs topic hierarchies with greater coherence and independence than the HLDA and Louvain algorithms. Finally, reliable review text classification is achieved using the MCR-extended bottom-level topic categories. The accuracy rate (P), recall rate (R) and F1 value of Internet review text classification obtained from the MCR-based topic hierarchy are significantly higher than those of the four text classification algorithms used for comparison.
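The pipeline summarized in the abstract (build a keyword co-occurrence network, cover it with boxes, collapse each box into a super-node, repeat) can be illustrated with a small sketch. This is not the authors' MCR algorithm: it uses a plain greedy box covering and ignores the edge adjacency entropy and box distance criteria, whose definitions are not given here. All function names and the toy review data are hypothetical, chosen only for illustration.

```python
from collections import deque
from itertools import combinations


def cooccurrence_network(docs):
    """Link every pair of keywords that appear together in one review."""
    adj = {}
    for keywords in docs:
        unique = sorted(set(keywords))
        for kw in unique:
            adj.setdefault(kw, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def distances_from(adj, source, limit):
    """Breadth-first hop distances from source, truncated at limit hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == limit:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_box_cover(adj, box_size):
    """Greedy covering: any two nodes in a box are fewer than box_size hops apart."""
    boxes, uncovered = [], set(adj)
    for seed in sorted(adj):  # fixed order keeps the sketch deterministic
        if seed not in uncovered:
            continue
        box = [seed]
        uncovered.discard(seed)
        for cand in sorted(distances_from(adj, seed, box_size - 1)):
            if cand in uncovered:
                reach = distances_from(adj, cand, box_size - 1)
                if all(member in reach for member in box):
                    box.append(cand)
                    uncovered.discard(cand)
        boxes.append(box)
    return boxes


def renormalize(adj, boxes):
    """Collapse each box into one super-node; keep edges that cross boxes."""
    owner = {node: i for i, box in enumerate(boxes) for node in box}
    coarse = {i: set() for i in range(len(boxes))}
    for u, neighbours in adj.items():
        for v in neighbours:
            if owner[u] != owner[v]:
                coarse[owner[u]].add(owner[v])
    return coarse


def topic_hierarchy(adj, box_size=2):
    """Repeat cover-and-collapse; each level lists the keywords under each box."""
    levels = []
    members = {node: [node] for node in adj}
    while len(adj) > 1:
        boxes = greedy_box_cover(adj, box_size)
        if len(boxes) == len(adj):  # nothing merged (e.g. disconnected parts)
            break
        members = {i: sorted(kw for n in box for kw in members[n])
                   for i, box in enumerate(boxes)}
        levels.append([members[i] for i in range(len(boxes))])
        adj = renormalize(adj, boxes)
    return levels


# Hypothetical toy reviews, each reduced to its keywords.
docs = [
    ["plot", "character", "ending"],
    ["price", "delivery", "packaging"],
    ["plot", "ending"],
    ["delivery", "price"],
    ["character", "price"],
]
levels = topic_hierarchy(cooccurrence_network(docs))
for depth, boxes in enumerate(levels, start=1):
    print(depth, boxes)
```

On this toy data the first renormalization step separates a content-related group (character, ending, plot) from a purchase-related group (delivery, packaging, price), and the second step merges them into a single root. Counting the boxes needed at several values of `box_size` and fitting the slope on a log-log scale is the usual way to estimate a fractal dimension from such a covering.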
Keywords: Internet Reviews; Keyword Co-Occurrence Network; Fractal; Box-Covering; Renormalization; Text Topic Hierarchy Mining
Date: 2022
Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0218348X22501341
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:fracta:v:30:y:2022:i:07:n:s0218348x22501341
DOI: 10.1142/S0218348X22501341
FRACTALS (fractals) is currently edited by Tara Taylor
More articles in FRACTALS (fractals) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.