Methodological Approach for Identifying Websites with Infringing Content via Text Transformers and Dense Neural Networks

Aldo Hernandez-Suarez, Gabriel Sanchez-Perez, Linda Karina Toscano-Medina, Hector Manuel Perez-Meana, Jose Portillo-Portillo and Jesus Olivares-Mercado
Additional contact information
Aldo Hernandez-Suarez: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Gabriel Sanchez-Perez: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Linda Karina Toscano-Medina: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Hector Manuel Perez-Meana: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Jose Portillo-Portillo: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Jesus Olivares-Mercado: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico

Future Internet, 2023, vol. 15, issue 12, 1-31

Abstract: The rapid evolution of the Internet of Everything (IoE) has significantly enhanced global connectivity and multimedia content sharing. It has also escalated the unauthorized distribution of multimedia content, which poses risks to intellectual property rights. In 2022 alone, about 130 billion accesses to potentially non-compliant websites were recorded, underscoring the challenges for industries reliant on copyright-protected assets. Given prevailing uncertainties and the need for technical and AI-integrated solutions, this study makes two contributions. First, it establishes a novel taxonomy aimed at safeguarding content and identifying IoE-based content infringements. Second, it proposes an architecture that combines IoE components with automated sensors to compile a dataset reflective of potential copyright breaches. This dataset is analyzed using a Natural Language Processing (NLP) algorithm based on Bidirectional Encoder Representations from Transformers (BERT), further fine-tuned by a dense neural network (DNN), achieving 98.71% accuracy in identifying websites that violate copyright.
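
The abstract describes text embeddings passed to a dense neural network that flags infringing websites. As a rough illustration only, the sketch below runs the forward pass of a small dense classification head over a fixed-length embedding vector. It is not the authors' implementation: the layer sizes, the random weights, the function names and the stand-in embedding are all assumptions, and the embedding width of 768 is simply the hidden size of BERT-base, since the abstract gives no dimensions.

```python
import math
import random


def dense(x, weights, bias):
    """One fully connected layer: returns W.x + b as a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (a trained model would learn these)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def infringement_score(embedding, hidden, output):
    """Embedding -> hidden ReLU layer -> single sigmoid unit in (0, 1)."""
    h = relu(dense(embedding, *hidden))
    return sigmoid(dense(h, *output)[0])


rng = random.Random(0)
EMBED_DIM = 768  # assumption: hidden size of BERT-base, not stated in the abstract
hidden = init_layer(EMBED_DIM, 16, rng)   # hypothetical hidden width of 16
output = init_layer(16, 1, rng)

# Stand-in for the embedding of a page's text (random values, not real data).
page_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
score = infringement_score(page_embedding, hidden, output)
print(f"score = {score:.3f}")
```

With untrained weights the score carries no meaning; in a real pipeline the embedding would come from a BERT encoder and the dense layers would be trained on labelled pages.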

Keywords: dense neural network; privacy violations; illegal download; BERT; natural language processing; infringing content
JEL-codes: O3
Date: 2023

Downloads: (external link)
https://www.mdpi.com/1999-5903/15/12/397/pdf (application/pdf)
https://www.mdpi.com/1999-5903/15/12/397/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:15:y:2023:i:12:p:397-:d:1297089

Future Internet is currently edited by Ms. Grace You

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jftint:v:15:y:2023:i:12:p:397-:d:1297089