An NLP-Based Framework to Spot Extremist Networks in Social Media

Rozo, Andrés Zapata; Díaz-López, Daniel; Pastor-Galindo, Javier; Mármol, Félix Gómez; Karabiyik, Umit; Xiong, Fei

Andrés Zapata Rozo, Daniel Díaz-López, Javier Pastor-Galindo, Félix Gómez Mármol, Umit Karabiyik and Fei Xiong

Complexity, 2024, vol. 2024, 1-24

Abstract: Governments and law enforcement agencies (LEAs) are increasingly concerned about growing illicit activities in cyberspace, such as cybercrime, cyberespionage, cyberterrorism, and cyberwarfare. In the particular context of cyberterrorism, hostile social manipulation (HSM) is a strategy that employs different manipulation methods, mostly through social media, to promote extremism in social groups and encourage hostile behavior against a target. This paper proposes a framework based on natural language processing (NLP) that detects and inspects suspected HSM actions to support LEAs in the prevention of cyberterrorism. The proposal integrates different NLP techniques through three models: (i) a similarity model that relates content with similar semantic meaning, (ii) a polarity analysis model that estimates polarity, and (iii) a named-entity recognition (NER) model that recognizes relevant entities. Each component of the framework is evaluated through exhaustive experiments, and the framework as a whole is tested on a use case related to violent protests in Ecuador in October 2021. The use case results indicate that 3 and 4 clusters are obtained when Spanish and English-translated tweets are used, respectively. A polarity analysis over the English-translated tweets identifies, through two different methods, the most negative cluster (#1). The results of the mention extraction show that the framework is able to identify person-type entities that may be at risk with a precision of 89.91%. The knowledge graphs built in the use case show how nodes that promote HSM are interconnected and work collaboratively. Finally, the computational costs of the proposal are favorable: the memory consumption of the similarity and polarity models is proportional to the number of processed tweets, which supports the feasibility of the solution in a real context.
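
Illustrative sketch (not part of the article): the abstract describes grouping semantically similar tweets and then estimating the polarity of each cluster to find the most negative one. The Python snippet below is a minimal toy version of that flow, under strong assumptions. It swaps in bag-of-words cosine similarity and a tiny hand-made word list where the paper uses its own NLP models, and every name in it (`cluster`, `polarity`, `most_negative_cluster`, the `NEGATIVE`/`POSITIVE` lexicons, the 0.5 threshold) is hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicons for illustration only; they are NOT from the paper.
NEGATIVE = {"attack", "burn", "destroy", "hate"}
POSITIVE = {"peace", "calm", "support", "love"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(tweets, threshold=0.5):
    """Greedy clustering: a tweet joins the first cluster whose seed tweet
    is at least `threshold` similar to it, otherwise it starts a new cluster.
    Returns a list of clusters, each a list of tweet indices."""
    vectors = [Counter(tokenize(t)) for t in tweets]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def polarity(text):
    """Lexicon score in [-1, 1]: (positive hits - negative hits) / tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens) / len(tokens)


def most_negative_cluster(tweets, clusters):
    """Index of the cluster with the lowest mean polarity."""
    means = [
        sum(polarity(tweets[i]) for i in members) / len(members)
        for members in clusters
    ]
    return min(range(len(clusters)), key=means.__getitem__)


tweets = [
    "burn the city and attack the police",
    "attack the police and burn the city now",
    "we want peace and calm in the streets",
    "peace and calm for the streets",
]
groups = cluster(tweets)
worst = most_negative_cluster(tweets, groups)
```

A production system would likely replace the count vectors with sentence embeddings and the word lists with a trained sentiment model; the sketch only shows how the similarity and polarity stages can be chained.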

Date: 2024

Downloads:
http://downloads.hindawi.com/journals/complexity/2024/3380488.pdf (application/pdf)
http://downloads.hindawi.com/journals/complexity/2024/3380488.xml (application/xml)

Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:3380488

DOI: 10.1155/2024/3380488
