Hybrid Feature Extraction for Multi-Label Emotion Classification in English Text Messages

Ahanin, Zahra; Ismail, Maizatul Akmar; Singh, Narinderjit Singh Sawaran; AL-Ashmori, Ammar

Additional contact information
Zahra Ahanin: Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur 50603, Malaysia
Maizatul Akmar Ismail: Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur 50603, Malaysia
Narinderjit Singh Sawaran Singh: Faculty of Data Science and Information Technology, INTI International University, Nilai 71800, Malaysia
Ammar AL-Ashmori: Department of Computer and Information Sciences, University Technology PETRONAS, Seri Iskandar 32610, Malaysia

Sustainability, 2023, vol. 15, issue 16, 1-24

Abstract: Emotions are vital for identifying an individual’s attitude and mental condition. Detecting and classifying emotions in Natural Language Processing applications can improve Human–Computer Interaction systems, leading to more effective decision making in organizations. Several studies on emotion classification have employed word embeddings as a feature extraction method, but these embeddings do not consider the sentiment polarity of words. Moreover, relying exclusively on deep learning models to extract linguistic features may result in misclassifications when the training dataset is small. In this paper, we present a hybrid feature extraction model that combines human-engineered features with deep-learning-based features for emotion classification in English text. To address these issues, the proposed model uses data augmentation, captures contextual information, integrates knowledge from lexical resources, and employs deep learning models, including Bidirectional Long Short-Term Memory (Bi-LSTM) and Bidirectional Encoder Representations from Transformers (BERT). The proposed model with hybrid features attained the highest Jaccard accuracy on two of the benchmark datasets: 68.40% on SemEval-2018 and 53.45% on GoEmotions. These results indicate that incorporating the hybrid features improves the performance of the baseline models.
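
For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of Jaccard accuracy for multi-label classification: the size of the intersection of the gold and predicted label sets divided by the size of their union, averaged over examples. The function name, the toy labels, and the choice to score an example with no gold and no predicted labels as 1.0 are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
from typing import Sequence, Set


def jaccard_accuracy(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Mean per-example Jaccard index between gold and predicted label sets."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    total = 0.0
    for g, p in zip(gold, pred):
        union = g | p
        # Assumption: when both label sets are empty, the example counts as a perfect match.
        total += 1.0 if not union else len(g & p) / len(union)
    return total / len(gold)


# Toy data (hypothetical, not taken from SemEval-2018 or GoEmotions).
gold = [{"joy", "optimism"}, {"anger"}, set()]
pred = [{"joy"}, {"anger", "disgust"}, set()]
score = jaccard_accuracy(gold, pred)
print(f"{score:.4f}")  # prints 0.6667
```

In the toy data the per-example scores are 1/2, 1/2, and 1, so the mean is 2/3. A partially correct prediction therefore earns partial credit, which is why this metric is commonly preferred over exact-match accuracy when a message can carry several emotions at once.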

Keywords: emotion classification; feature extraction; natural language processing; neural networks; word embeddings; social media
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2023

Downloads:
https://www.mdpi.com/2071-1050/15/16/12539/pdf (application/pdf)
https://www.mdpi.com/2071-1050/15/16/12539/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:15:y:2023:i:16:p:12539-:d:1219822

Sustainability is currently edited by Ms. Alexandra Wu

Bibliographic data for series maintained by MDPI Indexing Manager.