Internet Data Analysis Methodology for Cyberterrorism Vocabulary Detection, Combining Techniques of Big Data Analytics, NLP and Semantic Web

Iván Castillo-Zúñiga, Francisco Javier Luna-Rosas, Laura C. Rodríguez-Martínez, Jaime Muñoz-Arteaga, Jaime Iván López-Veyna and Mario A. Rodríguez-Díaz
Additional contact information
Iván Castillo-Zúñiga: Instituto Tecnológico del Llano, Aguascalientes / Instituto Tecnológico de Aguascalientes, Aguascalientes, Mexico
Francisco Javier Luna-Rosas: TecNM/Instituto Tecnológico de Aguascalientes, Aguascalientes, Mexico
Laura C. Rodríguez-Martínez: Tecnológico Nacional de México/I.T. Aguascalientes, Mexico
Jaime Muñoz-Arteaga: Universidad Autonoma de Aguascalientes, Aguascalientes, Mexico
Jaime Iván López-Veyna: Instituto Tecnológico de Zacatecas, Zacatecas, Mexico
Mario A. Rodríguez-Díaz: TecNM/Instituto Tecnológico de Aguascalientes, Aguascalientes, Mexico

International Journal on Semantic Web and Information Systems (IJSWIS), 2020, vol. 16, issue 1, 69-86

Abstract: This article presents a methodology for analyzing data on the Internet that combines techniques from Big Data analytics, NLP, and the semantic web to extract knowledge from large amounts of information on the web. To test the effectiveness of the proposed method, webpages about cyberterrorism were analyzed as a case study. The procedure implemented a parallel genetic strategy that integrates a crawler, which locates and downloads information from the web, with vocabulary retrieval based on NLP techniques (tokenization, stop-word removal, TF, TF-IDF), stemming, and synonym methods. To extract knowledge, a dataset was built by describing a linguistic corpus with semantic ontologies that capture the characteristics of cyberterrorism; this dataset was analyzed with the Random Forests (parallel), Boosting, SVM, neural network, k-NN, and Bayes algorithms. The results show 95.62% accuracy in detecting cyberterrorism vocabulary, validated through cross-validation, and a 576% time saving with parallel processing.
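
The vocabulary-retrieval steps named in the abstract (tokenization, stop-word removal, TF, TF-IDF) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the function names, and the choice of idf = ln(N / df) are assumptions made here for illustration, and stemming and synonym handling are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only (not the authors' list).
STOP_WORDS = {"the", "a", "an", "of", "on", "and", "to", "in", "is", "was"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def term_frequency(tokens):
    """Relative frequency of each term within a single document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def tf_idf(documents):
    """Return one {term: weight} dict per document, with idf = ln(N / df)."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = term_frequency(tokens)
        weights.append({t: f * math.log(n_docs / doc_freq[t]) for t, f in tf.items()})
    return weights
```

With this weighting, a term that occurs in every document receives a weight of zero, so only terms that distinguish one page from the others contribute to the extracted vocabulary.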

Date: 2020
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 18/IJSWIS.2020010104 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jswis0:v:16:y:2020:i:1:p:69-86

International Journal on Semantic Web and Information Systems (IJSWIS) is published by IGI Global and is currently edited by Brij Gupta.

Page updated 2025-03-19
Handle: RePEc:igg:jswis0:v:16:y:2020:i:1:p:69-86