Ensemble Based Classification of Sentiments Using Forest Optimization Algorithm
Mehreen Naz, Kashif Zafar and Ayesha Khan
Additional contact information
Mehreen Naz: Department of Computer Science; National University of Computer and Emerging Sciences, Lahore 54770, Pakistan
Kashif Zafar: Department of Computer Science; National University of Computer and Emerging Sciences, Lahore 54770, Pakistan
Ayesha Khan: School of Science and Technology; University of Management and Technology, Lahore 54782, Pakistan
Data, 2019, vol. 4, issue 2, 1-13
Abstract:
Feature subset selection is the process of choosing a set of relevant features from a high-dimensional dataset to improve the performance of classifiers. In sentiment analysis, the meaningful words extracted from the data form the set of features. Many evolutionary algorithms, such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), have been applied to the feature subset selection problem, but computational performance can still be improved. This research presents a solution to the feature subset selection problem for the classification of sentiments using ensemble-based classifiers. It consists of a hybrid technique that combines minimum redundancy and maximum relevance (mRMR) with Forest Optimization Algorithm (FOA)-based feature selection. Ensemble-based classification is implemented to optimize the results of individual classifiers. The Forest Optimization Algorithm as a feature selection technique has been applied to various classification datasets from the UCI machine learning repository. The classifiers used in the ensemble methods for the UCI repository datasets are k-Nearest Neighbor (k-NN) and Naïve Bayes (NB). For the classification of sentiments, a 15–20% improvement has been recorded. The dataset used for the classification of sentiments is Blitzer’s dataset, which consists of reviews of electronic products. The results are further improved by an ensemble of k-NN, NB, and Support Vector Machine (SVM), which reaches an accuracy of 95% on the sentiment classification task.
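The two ideas named in the abstract, mRMR filtering and ensemble classification, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes discrete features, a greedy "relevance minus mean redundancy" mRMR criterion, and hard majority voting as the ensemble rule, none of which the abstract specifies. The FOA search step is not shown, and the function names `mutual_information`, `mrmr_select`, and `majority_vote` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, labels, k):
    """Greedily pick up to k feature indices (assumed criterion: relevance to
    the labels minus mean redundancy with the features already selected)."""
    relevance = [mutual_information(col, labels) for col in columns]
    # Start with the single most relevant feature.
    selected = [max(range(len(columns)), key=lambda j: relevance[j])]
    while len(selected) < min(k, len(columns)):
        def score(j):
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        candidates = [j for j in range(len(columns)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected


def majority_vote(predictions):
    """Combine one list of predicted labels per classifier into one label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data (made up for illustration): feature 1 duplicates feature 0,
# feature 2 is weaker but carries different information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
]
chosen = mrmr_select(columns, labels, 2)  # the duplicate feature is skipped

# Hypothetical outputs of three classifiers (e.g. k-NN, NB, SVM) on three reviews.
votes = majority_vote([
    ["pos", "neg", "pos"],
    ["pos", "neg", "neg"],
    ["neg", "neg", "pos"],
])
```

In this toy run the redundancy penalty steers the second pick away from the duplicated feature, and the vote returns the label that at least two of the three classifiers agree on for each review.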
Keywords: feature subset selection; classification; ensemble; evolutionary algorithms; data mining; sentiment analysis
JEL-codes: C8 C80 C81 C82 C83
Date: 2019
Downloads:
https://www.mdpi.com/2306-5729/4/2/76/pdf (application/pdf)
https://www.mdpi.com/2306-5729/4/2/76/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:4:y:2019:i:2:p:76-:d:233814
Data is currently edited by Ms. Cecilia Yang