Machine-Learning-Based Approaches for Multi-Level Sentiment Analysis of Romanian Reviews
Anamaria Briciu,
Alina-Delia Călin,
Diana-Lucia Miholca,
Cristiana Moroz-Dubenco,
Vladiela Petrașcu and
George Dascălu
Additional contact information
Anamaria Briciu: Department of Computer Science, Babeş-Bolyai University, 1 M. Kogalniceanu Street, 400084 Cluj-Napoca, Romania
Alina-Delia Călin: Department of Computer Science, Babeş-Bolyai University, 1 M. Kogalniceanu Street, 400084 Cluj-Napoca, Romania
Diana-Lucia Miholca: Department of Computer Science, Babeş-Bolyai University, 1 M. Kogalniceanu Street, 400084 Cluj-Napoca, Romania
Cristiana Moroz-Dubenco: Department of Computer Science, Babeş-Bolyai University, 1 M. Kogalniceanu Street, 400084 Cluj-Napoca, Romania
Vladiela Petrașcu: Department of Computer Science, Babeş-Bolyai University, 1 M. Kogalniceanu Street, 400084 Cluj-Napoca, Romania
George Dascălu: T2 S.R.L., 35 Ceauș Firică Street, 145100 Roșiori de Vede, Romania
Mathematics, 2024, vol. 12, issue 3, 1-37
Abstract:
Sentiment analysis has gained significance in commercial settings, driven by the growing influence of reviews on purchase decisions. This research examines the suitability of machine learning and deep learning approaches for sentiment analysis, using Romanian reviews as a case study, with the aim of gaining insights into their practical utility. A multi-level analysis is performed, covering the document, sentence, and aspect levels. The main contributions of the paper are an in-depth exploration of multiple sentiment analysis models at three textual levels and the improvements made over these standard models. Additionally, a balanced dataset of Romanian reviews from twelve product categories is introduced. The results indicate that, at the document level, supervised deep learning techniques yield the best outcomes (specifically, a convolutional neural network model that obtains an AUC value of 0.93 for binary classification and a weighted average F1-score of 0.77 in a multi-class setting with five target classes), albeit with increased resource consumption. Favorable results are also achieved at the sentence level, despite the greater difficulty of sentiment identification. Here, the best-performing model is logistic regression, which obtains a weighted average F1-score of 0.77 in a multi-class polarity classification task with three classes. Finally, at the aspect level, promising outcomes are observed in both aspect term extraction and aspect category detection, in the form of coherent and easily interpretable word clusters, encouraging further exploration of aspect-based sentiment analysis for the Romanian language.
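The abstract reports weighted average F1-scores for the multi-class tasks. As a minimal sketch of how that metric is usually defined (per-class F1, weighted by each class's share of the true labels), the following standard-library Python may help. The function name `weighted_f1` and the toy labels are hypothetical illustrations and are not taken from the paper or its dataset.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Weighted-average F1: per-class F1 weighted by class support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true instances of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score

# Hypothetical three-class polarity labels, for illustration only.
y_true = ["neg", "neg", "neu", "pos", "pos", "pos"]
y_pred = ["neg", "neu", "neu", "pos", "pos", "neg"]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Because each class contributes in proportion to its support, this average can differ noticeably from the unweighted (macro) average when classes are imbalanced.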
Keywords: sentiment analysis; latent semantic indexing; machine learning; deep learning; CNN; dense embedding layer; aspect term extraction; aspect category detection; Romanian language
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/3/456/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/3/456/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:3:p:456-:d:1330410
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.