Methodological Approach for Identifying Websites with Infringing Content via Text Transformers and Dense Neural Networks
Aldo Hernandez-Suarez,
Gabriel Sanchez-Perez,
Linda Karina Toscano-Medina,
Hector Manuel Perez-Meana,
Jose Portillo-Portillo and
Jesus Olivares-Mercado
Additional contact information
Aldo Hernandez-Suarez: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Gabriel Sanchez-Perez: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Linda Karina Toscano-Medina: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Hector Manuel Perez-Meana: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Jose Portillo-Portillo: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Jesus Olivares-Mercado: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
Future Internet, 2023, vol. 15, issue 12, 1-31
Abstract:
The rapid evolution of the Internet of Everything (IoE) has significantly enhanced global connectivity and multimedia content sharing. It has also escalated the unauthorized distribution of multimedia content, posing risks to intellectual property rights. In 2022 alone, about 130 billion accesses to potentially non-compliant websites were recorded, underscoring the challenges for industries reliant on copyright-protected assets. Amid prevailing uncertainties and the need for technical and AI-integrated solutions, this study makes two contributions. First, it establishes a novel taxonomy for safeguarding against and identifying IoE-based content infringements. Second, it proposes an architecture that combines IoE components with automated sensors to compile a dataset reflective of potential copyright breaches. This dataset is analyzed using a Natural Language Processing (NLP) algorithm based on Bidirectional Encoder Representations from Transformers (BERT), further fine-tuned by a dense neural network (DNN), which achieves 98.71% accuracy in identifying websites that violate copyright.
Keywords: dense neural network; privacy violations; illegal download; BERT; natural language processing; infringing content
JEL-codes: O3
Date: 2023
Downloads:
https://www.mdpi.com/1999-5903/15/12/397/pdf (application/pdf)
https://www.mdpi.com/1999-5903/15/12/397/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:15:y:2023:i:12:p:397-:d:1297089
Future Internet is currently edited by Ms. Grace You
Bibliographic data for series maintained by MDPI Indexing Manager.