HARNESSING CORPUS LINGUISTICS AND DATA-DRIVEN LEARNING APPROACH FOR ARABIC MULTI-CLASS DIALECT DETECTION AND CLASSIFICATION

Muhammad Swaileh A. Alzaidi, Alya Alshammari, Saad Alahmari, Hanan Al Sultan, Abdulkhaleq Q. A. Hassan and Ahmed S. Salama
Additional contact information
Muhammad Swaileh A. Alzaidi: Department of English Language, College of Language Sciences, King Saud University, P. O. Box 145111, Riyadh, Saudi Arabia
Alya Alshammari: Department of Applied Linguistics, College of Languages, Princess Nourah bint Abdulrahman University, P. O. Box 84428, Riyadh 11671, Saudi Arabia
Saad Alahmari: Department of Computer Science, Applied College, Northern Border University, Arar 91431, Saudi Arabia
Hanan Al Sultan: Department of English, College of Arts, King Faisal University, Saudi Arabia
Abdulkhaleq Q. A. Hassan: Department of English, College of Science and Arts at Mahayil, King Khalid University, Saudi Arabia
Ahmed S. Salama: Department of Electrical Engineering, Faculty of Engineering & Technology, Future University in Egypt, New Cairo 11845, Egypt

FRACTALS (fractals), 2025, vol. 33, issue 02, 1-14

Abstract: Arabic dialect identification (ADI) is a natural language processing (NLP) task that automatically predicts the Arabic dialect of an input text and assigns it the corresponding label. ADI is a preliminary step for many dialect-sensitive NLP applications, including cross-language text generation, multilingual text-to-speech synthesis, and machine translation. As a result, interest in addressing ADI with deep learning (DL) and machine learning (ML) algorithms has grown over the last few decades. This study develops an Arabic multi-class dialect recognition technique using a fast random opposition-based fractals learning aquila optimizer with DL (FROBLAO-DL). The FROBLAO-DL technique uses an optimally tuned DL model to identify distinct Arabic dialects. First, data preprocessing cleans the input Arabic dialect dataset. Next, the ROBERTa model generates word embeddings. An attention bidirectional long short-term memory (ABiLSTM) network then classifies the dialects, and its hyperparameters are tuned using the FROBLAO method. The FROBLAO-DL method is evaluated on an Arabic dialect dataset. The empirical analysis indicates that the FROBLAO-DL technique outperforms recent approaches across various measures.
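
Illustrative sketch (not taken from the article): the abstract names a random opposition-based learning component in the optimizer used for hyperparameter tuning, but gives no formulas. The Python below shows only the general idea of opposition-based learning as it is commonly described: for a candidate inside a bounded search space, also evaluate its "opposite" point and keep the better of the two. The randomized form `lower + upper - r * x`, the clipping to bounds, the two-dimensional search space, and the `toy_loss` objective are all assumptions chosen for illustration, and may differ from what FROBLAO actually does.

```python
import random


def opposite(x, lower, upper, rng=None):
    """Return the opposite of candidate x inside the box [lower, upper].

    With rng=None this is plain opposition: lower + upper - x.
    With an rng, each coordinate uses lower + upper - r * x, r ~ U(0, 1),
    which is one common randomized variant (an assumption here).
    Results are clipped so they stay inside the bounds.
    """
    result = []
    for xi, lo, hi in zip(x, lower, upper):
        r = 1.0 if rng is None else rng.random()
        candidate = lo + hi - r * xi
        result.append(min(max(candidate, lo), hi))
    return result


def obl_step(x, lower, upper, loss, rng=None):
    """Keep whichever of x and its opposite has the lower loss."""
    x_opp = opposite(x, lower, upper, rng)
    return x_opp if loss(x_opp) < loss(x) else x


# Hypothetical search space: [learning_rate, dropout].
LOWER = [1e-4, 0.0]
UPPER = [1e-1, 0.5]


def toy_loss(h):
    """Stand-in objective; a real run would train and validate a model."""
    return (h[0] - 0.08) ** 2 + (h[1] - 0.4) ** 2


if __name__ == "__main__":
    start = [0.02, 0.1]
    print("deterministic:", obl_step(start, LOWER, UPPER, toy_loss))
    print("randomized:   ", obl_step(start, LOWER, UPPER, toy_loss, random.Random(0)))
```

In a full optimizer this step would typically be applied to each member of the population at initialization or after each update, with the toy objective replaced by the validation error of the classifier being tuned.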

Keywords: Arabic Dialect Identification; Data-driven Approach; Deep Learning; Fractals Aquila Optimizer; ROBERTa; Natural Language Processing
Date: 2025

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0218348X25400079
Access to full text is restricted to subscribers


Persistent link: https://EconPapers.repec.org/RePEc:wsi:fracta:v:33:y:2025:i:02:n:s0218348x25400079


DOI: 10.1142/S0218348X25400079


FRACTALS (fractals) is currently edited by Tara Taylor

Bibliographic data for series maintained by Tai Tone Lim.

 
Page updated 2025-04-19
Handle: RePEc:wsi:fracta:v:33:y:2025:i:02:n:s0218348x25400079