Pre-Trained Language Model Ensemble for Arabic Fake News Detection
Lama Al-Zahrani and Maha Al-Yahya
Additional contact information
Lama Al-Zahrani: Information Technology Department, College of Computer and Information Sciences, King Saud University, P.O. Box 145111, Riyadh 4545, Saudi Arabia
Maha Al-Yahya: Information Technology Department, College of Computer and Information Sciences, King Saud University, P.O. Box 145111, Riyadh 4545, Saudi Arabia
Mathematics, 2024, vol. 12, issue 18, 1-17
Abstract:
Fake news detection (FND) remains a challenge because fake news comes from vast and varied sources, especially on social media platforms. While academia and industry have made numerous attempts to develop fake news detection systems, research on Arabic content remains limited. This study investigates transformer-based language models for Arabic FND. Although transformer-based models have shown promising performance in various natural language processing tasks, they often struggle with tasks involving complex linguistic patterns and cultural contexts, resulting in unreliable performance and misclassification. To overcome these challenges, we investigated an ensemble of transformer-based models. We experimented with five Arabic transformer models: AraBERT, MARBERT, AraELECTRA, AraGPT2, and ARBERT. Various ensemble approaches, including a weighted-average ensemble, hard voting, and soft voting, were evaluated to determine the most effective techniques for boosting learning models and improving prediction accuracy. The results demonstrate that ensemble models significantly boost the performance of the baseline models. An important finding is that ensemble models achieved excellent performance on the Arabic Multisource Fake News Detection (AMFND) dataset, reaching an F1 score of 94% with the weighted-average ensemble. Moreover, changing the number of models in the ensemble has only a slight effect on performance. These key findings contribute to the advancement of fake news detection in Arabic, offering valuable insights for both academia and industry.
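The three ensemble strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the probabilities, the label encoding (0 = real, 1 = fake), the model weights, and all function names are assumptions chosen for illustration, and the abstract does not say how the weights were derived.

```python
from collections import Counter

def argmax(values):
    """Index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=lambda i: values[i])

def hard_vote(probs):
    """Majority vote over each model's predicted label.

    probs[m][s][c] is model m's probability that sample s has class c.
    """
    labels = []
    for s in range(len(probs[0])):
        votes = Counter(argmax(model[s]) for model in probs)
        # Iterate labels in ascending order so ties go to the lowest label.
        labels.append(max(sorted(votes), key=lambda c: votes[c]))
    return labels

def weighted_average(probs, weights):
    """Average class probabilities with per-model weights, then take argmax."""
    total = float(sum(weights))
    norm = [w / total for w in weights]
    labels = []
    for s in range(len(probs[0])):
        n_classes = len(probs[0][s])
        avg = [sum(norm[m] * probs[m][s][c] for m in range(len(probs)))
               for c in range(n_classes)]
        labels.append(argmax(avg))
    return labels

def soft_vote(probs):
    """Soft voting is the weighted average with equal weights."""
    return weighted_average(probs, [1.0] * len(probs))

# Made-up probabilities for 3 hypothetical models and 2 samples.
# Assumed label encoding: index 0 = real, index 1 = fake.
probs = [
    [[0.90, 0.10], [0.40, 0.60]],  # model A
    [[0.45, 0.55], [0.45, 0.55]],  # model B
    [[0.40, 0.60], [0.70, 0.30]],  # model C
]

print(hard_vote(probs))                    # [1, 1]
print(soft_vote(probs))                    # [0, 0]
print(weighted_average(probs, [3, 1, 1]))  # [0, 1]
```

On this toy input the three strategies disagree. Hard voting counts only each model's top label, soft voting lets a confident model outweigh two uncertain ones, and the weighted average additionally favours the model given the largest weight. In practice the weights might be set from each model's validation score, which is again an assumption rather than something the abstract states.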
Keywords: fake news detection; Arabic; learning ensemble; LLM; AraBERT; MARBERT; AraELECTRA; AraGPT2; ARBERT
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/18/2941/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/18/2941/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:18:p:2941-:d:1482716
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.