Counteracting French Fake News on Climate Change Using Language Models
Paul Meddeb,
Stefan Ruseti,
Mihai Dascalu,
Simina-Maria Terian and
Sebastien Travadel
Additional contact information
Paul Meddeb: Centre of Research on Risks and Crisis Management, Mines Paris – PSL, 1 Rue Claude Daunesse, 06560 Valbonne, France
Stefan Ruseti: Computer Science & Engineering Department, University Politehnica of Bucharest, 313 Splaiul Independentei, 060042 Bucharest, Romania
Mihai Dascalu: Computer Science & Engineering Department, University Politehnica of Bucharest, 313 Splaiul Independentei, 060042 Bucharest, Romania
Simina-Maria Terian: Department of Romance Studies, Lucian Blaga University of Sibiu, 10 Victoriei Blvd., 550024 Sibiu, Romania
Sebastien Travadel: Centre of Research on Risks and Crisis Management, Mines Paris – PSL, 1 Rue Claude Daunesse, 06560 Valbonne, France
Sustainability, 2022, vol. 14, issue 18, 1-14
Abstract:
The unprecedented scale of disinformation on the Internet over more than a decade represents a serious challenge for democratic societies. When disinformation targets a well-established subject such as climate change, it can subvert the measures and policies that governmental bodies have taken to mitigate the phenomenon. It is therefore essential to effectively identify and counteract fake news on climate change. To this end, our main contribution is a novel dataset of more than 2300 articles written in French, gathered through web scraping from all types of media dealing with climate change. Two annotators manually labeled the articles with three classes: “fake”, “biased”, and “true”. Machine Learning models, ranging from bag-of-words representations used by an SVM to Transformer-based architectures built on top of CamemBERT, were trained to automatically classify the articles. Our results, with an F1-score of 84.75% at the article level using the BERT-based model coupled with hand-crafted features specifically tailored for this task, represent a strong baseline. At the sentence level, we also highlight text sequences with perceptual properties (i.e., fake, biased, and irrelevant text fragments), with a macro F1 of 45.01% and a micro F1 of 78.11%. Based on these results, our proposed method facilitates the identification of fake news and thus contributes to better education of the public.
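The abstract reports both a macro F1 and a micro F1 at the sentence level, and the two differ widely. The sketch below shows how the two averages are computed for single-label classification and why they can diverge when one class dominates. The function name `f1_scores` and the toy labels are illustrative assumptions; they are not the authors' code or data, and the abstract does not state the cause of the gap.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class weighs the same.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy data (not from the paper): one dominant class,
# and a classifier that always predicts it.
y_true = ["irrelevant"] * 8 + ["fake", "biased"]
y_pred = ["irrelevant"] * 10

macro, micro = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
# prints "macro F1 = 0.296, micro F1 = 0.800"
```

In this toy case the micro F1 equals plain accuracy, while the macro F1 is pulled down by the two rare classes that are never predicted correctly.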
Keywords: fake news detection; Natural Language Processing; sustainable education; Language Models; climate change
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2022
Downloads:
https://www.mdpi.com/2071-1050/14/18/11724/pdf (application/pdf)
https://www.mdpi.com/2071-1050/14/18/11724/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:14:y:2022:i:18:p:11724-:d:918477
Sustainability is currently edited by Ms. Alexandra Wu
Bibliographic data for series maintained by MDPI Indexing Manager.