Integrating Feature and Instance Selection Techniques in Opinion Mining
Zi-Hung You, Ya-Han Hu, Chih-Fong Tsai and Yen-Ming Kuo
Additional contact information
Zi-Hung You: Department of Nephrology, Chiayi Branch, Taichung Veterans General Hospital, Chiayi, Taiwan
Ya-Han Hu: Department of Information Management, National Central University, Taoyuan, Taiwan & Center for Innovative Research on Aging Society (CIRAS), National Chung Cheng University, Chiayi, Taiwan & MOST AI Biomedical Research Center at National Cheng Kung University, Tainan, Taiwan
Chih-Fong Tsai: Department of Information Management, National Central University, Taiwan
Yen-Ming Kuo: Department of Information Management, National Chung Cheng University, Chiayi, Taiwan
International Journal of Data Warehousing and Mining (IJDWM), 2020, vol. 16, issue 3, 168-182
Abstract:
Opinion mining focuses on extracting polarity information from texts. For textual term representation, different feature selection methods, e.g., term frequency (TF) or term frequency–inverse document frequency (TF–IDF), can yield different numbers of text features. In text classification, however, a selected training set may contain noisy documents (or outliers), which can degrade classification performance. To address this problem, instance selection can be adopted to filter out unrepresentative training documents. This article therefore investigates opinion mining performance when feature selection and instance selection are applied together. Two combination processes, which perform feature selection and instance selection in different orders, were compared. Specifically, two feature selection methods (TF and TF–IDF) and two instance selection methods (DROP3 and IB3) were employed. Experimental results from sentiment classifiers developed on three Twitter datasets showed that TF–IDF followed by DROP3 performed best.
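The two processing orders described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the tweets and labels are invented toy data, TF–IDF uses the common idf = log(N / df) variant and only weights terms without pruning the vocabulary, and instance selection is stood in for by an edited-nearest-neighbour (ENN) style noise filter, which is similar to the first pass of DROP3 but is neither DROP3 nor IB3 in full. All function names are hypothetical.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Raw term-frequency vectors, one {term: count} mapping per document."""
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_vectors(docs):
    """TF-IDF vectors with idf = log(N / df); one common variant (assumption)."""
    tfs = tf_vectors(docs)
    n = len(docs)
    df = Counter(term for tf in tfs for term in tf)  # document frequency
    return [{t: c * math.log(n / df[t]) for t, c in tf.items()} for tf in tfs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def enn_filter(vectors, labels, k=3):
    """ENN-style noise filter (a stand-in, not full DROP3 or IB3).

    Returns the indices of instances whose label agrees with the
    majority label of their k most similar neighbours.
    """
    keep = []
    for i, v in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: cosine(v, vectors[j]), reverse=True)
        votes = Counter(labels[j] for j in others[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return keep


def features_then_instances(docs, labels, k=3):
    """Order A: build TF-IDF features first, then filter instances."""
    vectors = tfidf_vectors(docs)
    keep = enn_filter(vectors, labels, k)
    return [vectors[i] for i in keep], [labels[i] for i in keep]


def instances_then_features(docs, labels, k=3):
    """Order B: filter instances on raw TF, then build TF-IDF on the rest."""
    keep = enn_filter(tf_vectors(docs), labels, k)
    kept_docs = [docs[i] for i in keep]
    return tfidf_vectors(kept_docs), [labels[i] for i in keep]


# Invented toy tweets; the last one is deliberately mislabelled noise.
docs = [
    "love great phone",
    "love great camera",
    "love great battery",
    "hate awful phone",
    "hate awful camera",
    "hate awful battery",
    "love great screen",
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg", "neg"]

x_a, y_a = features_then_instances(docs, labels)
x_b, y_b = instances_then_features(docs, labels)
print(len(y_a), len(y_b))
```

In this toy case both orders drop the mislabelled tweet, but the resulting feature weights differ, because in the second order the document frequencies are computed only over the retained documents. The filtered vectors and labels would then be passed to whichever sentiment classifier is being trained.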
Date: 2020
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJDWM.2020070109 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:16:y:2020:i:3:p:168-182
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede