
Sentiment Analysis of Multilingual Dataset of Bahraini Dialects, Arabic, and English

Thuraya Omran, Baraa Sharef, Crina Grosan and Yongmin Li
Additional contact information
Thuraya Omran: Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK
Baraa Sharef: Department of Information Technology, College of Information Technology, Ahlia University, Manama P.O. Box 10878, Bahrain
Crina Grosan: Division of Applied Technologies for Clinical Care, King’s College London, London WC2R 2LS, UK
Yongmin Li: Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK

Data, 2023, vol. 8, issue 4, 1-13

Abstract: Sentiment analysis is an application of natural language processing (NLP) that requires a machine learning algorithm and a dataset. For some languages, suitable datasets are scarce, particularly for Arabic dialects and specifically the Bahraini ones. This scarcity necessitates an approach such as translation, in which a resource-rich source language is exploited to create a dataset in the target language. This study presents a dataset of Amazon product reviews in Bahraini dialects. The dataset was generated in two cascaded stages of translation: machine translation followed by manual translation. In the first stage, Google Translate was used to translate English Amazon product reviews into Standard Arabic. In the second stage, qualified native speakers translated the resulting Arabic reviews into Bahraini dialects using custom-built forms. The resulting parallel dataset of English, Standard Arabic, and Bahraini dialects is called English_Modern Standard Arabic_Bahraini Dialects product reviews for sentiment analysis, “E_MSA_BDs-PR-SA”. The dataset is balanced, comprising 2500 positive and 2500 negative reviews. Sentiment analysis was implemented using a stacked LSTM deep learning model. The Bahraini dialect product review dataset can be used in transfer learning for sentiment analysis of other datasets in Bahraini dialects.
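
The parallel, balanced structure described in the abstract can be illustrated with a minimal sketch. The record layout, field names, label encoding (1 = positive, 0 = negative), and helper functions below are assumptions made for illustration; they are not taken from the published dataset or the authors' code, and the sample rows are placeholders rather than real reviews.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelReview:
    """One row of a three-way parallel corpus (hypothetical field names)."""
    english: str   # original English Amazon review
    msa: str       # machine translation into Modern Standard Arabic
    bahraini: str  # manual translation into a Bahraini dialect
    label: int     # assumed encoding: 1 = positive, 0 = negative


def label_counts(reviews):
    """Count how many reviews carry each sentiment label."""
    return Counter(r.label for r in reviews)


def is_balanced(reviews):
    """True when the list is non-empty and both labels occur equally often."""
    counts = label_counts(reviews)
    return len(reviews) > 0 and counts[1] == counts[0]


# Placeholder rows standing in for real data; the texts are not from the dataset.
toy = [
    ParallelReview("en-pos", "msa-pos", "bd-pos", 1),
    ParallelReview("en-neg", "msa-neg", "bd-neg", 0),
]
```

Under these assumptions, a corpus with 2500 rows of each label would pass `is_balanced`, which is the property the abstract reports for E_MSA_BDs-PR-SA.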

Keywords: Bahraini dialects resources; Bahraini resources scarcity; deep learning; products reviews
JEL-codes: C8 C80 C81 C82 C83
Date: 2023

Downloads: (external link)
https://www.mdpi.com/2306-5729/8/4/68/pdf (application/pdf)
https://www.mdpi.com/2306-5729/8/4/68/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:8:y:2023:i:4:p:68-:d:1111305

Data is currently edited by Ms. Cecilia Yang

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jdataj:v:8:y:2023:i:4:p:68-:d:1111305