From online hate speech to offline hate crime: the role of inflammatory language in forecasting violence against migrant and LGBT communities
Carlos Arcila Calderón,
Patricia Sánchez Holgado,
Jesús Gómez,
Marcos Barbosa,
Haodong Qi,
Alberto Matilla,
Pilar Amado,
Alejandro Guzmán,
Daniel López-Matías and
Tomás Fernández-Villazala
Additional contact information
Carlos Arcila Calderón: University of Salamanca
Patricia Sánchez Holgado: University of Salamanca
Jesús Gómez: Secretary of State for Security, Ministry of Interior
Marcos Barbosa: University of Salamanca
Haodong Qi: Malmö University
Alberto Matilla: Secretary of State for Security, Ministry of Interior
Pilar Amado: Secretary of State for Security, Ministry of Interior
Alejandro Guzmán: Universidad Autónoma de Madrid
Daniel López-Matías: Universidad Rey Juan Carlos
Tomás Fernández-Villazala: Secretary of State for Security, Ministry of Interior
Palgrave Communications, 2024, vol. 11, issue 1, 1-14
Abstract:
Social media messages often provide insights into offline behaviors. Although hate speech proliferates rapidly across social media platforms, it is rarely recognized as a cybercrime, even when it may be linked to offline hate crimes that typically involve physical violence. This paper aims to anticipate violent acts by analyzing online hate speech (hatred, toxicity, and sentiment) and comparing it with offline hate crime. The dataset for this preregistered study comprised social media posts from X (formerly Twitter) and Facebook, together with internal police records of hate crimes reported in Spain between 2016 and 2018. After preliminary data analysis confirmed a moderate temporal correlation, we used time series analysis to develop computational models (VAR, GLMNet, and XGBTree) to predict four time periods of these rare events on a daily and weekly basis. Forty-eight models were run to forecast two types of offline hate crimes: those against migrants and those against the LGBT community. The best model for crimes against migrants achieved an R2 of 64%, while the best for crimes against the LGBT community reached 53%. According to the best ML models, weekly aggregations outperformed daily aggregations, national models outperformed those geolocated in Madrid, and models for migration were more effective than those for LGBT people. Moreover, toxic language outperformed hatred and sentiment analysis, Facebook posts were better predictors than tweets, and in most cases speech temporally preceded crime. Although we make no claims about causation, we conclude that online inflammatory language could serve as a leading indicator of potential hate crimes and that these models have practical applications for preventing such crimes.
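The forecasting setup the abstract describes — lagged online-speech signals used to predict later offline crime counts — can be illustrated with a minimal sketch. The weekly series and the one-lag ordinary-least-squares model below are hypothetical stand-ins, not the study's data or method; the actual models (VAR, GLMNet, XGBTree) use richer lag structures and regularization.

```python
# Minimal sketch (hypothetical data): predict this week's hate-crime count
# from last week's online toxicity score, in the spirit of the paper's
# "speech temporally precedes crime" finding. One lag, plain OLS.

def fit_lagged_ols(toxicity, crimes):
    """Fit crimes[t] = a + b * toxicity[t-1] by ordinary least squares."""
    x = toxicity[:-1]   # toxicity at week t-1 (the leading indicator)
    y = crimes[1:]      # crimes at week t
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Hypothetical weekly toxicity index and reported hate-crime counts
toxicity = [0.2, 0.4, 0.3, 0.6, 0.5, 0.8]
crimes = [3, 5, 4, 7, 6, 9]

a, b = fit_lagged_ols(toxicity, crimes)
next_week_forecast = a + b * toxicity[-1]  # forecast from latest toxicity
```

A positive slope `b` here would indicate that weeks with more toxic language tend to be followed by weeks with more reported crimes — an association, not a causal claim, exactly as the authors hedge.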
Date: 2024
Downloads: http://link.springer.com/10.1057/s41599-024-03899-1 Abstract (text/html; access to full text is restricted to subscribers)
Persistent link: https://EconPapers.repec.org/RePEc:pal:palcom:v:11:y:2024:i:1:d:10.1057_s41599-024-03899-1
Ordering information: This journal article can be ordered from https://www.nature.com/palcomms/about
DOI: 10.1057/s41599-024-03899-1