Graph-Based Siamese Network for Authorship Verification

Daniel Embarcadero-Ruiz, Helena Gómez-Adorno, Alberto Embarcadero-Ruiz and Gerardo Sierra
Additional contact information
Daniel Embarcadero-Ruiz: Posgrado en Ciencia e Ingeniería de la Computación, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
Helena Gómez-Adorno: Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
Alberto Embarcadero-Ruiz: Posgrado en Ciencia e Ingeniería de la Computación, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
Gerardo Sierra: Instituto de Ingeniería, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico

Mathematics, 2022, vol. 10, issue 2, 1-24

Abstract: In this work, we propose a novel approach to solve the authorship identification task in a cross-topic and open-set scenario. Authorship verification is the task of determining whether or not two texts were written by the same author. We model the documents as graphs, and a graph neural network then extracts relevant features from these graph representations. We present three strategies to represent the texts as graphs based on the co-occurrence of the POS labels of words. We propose a Siamese network architecture composed of graph convolutional networks along with pooling and classification layers. We present different variants of the architecture and discuss the performance of each one. To evaluate our approach, we used a collection of fanfiction texts provided by the PAN@CLEF 2021 shared task in two settings: a “small” corpus and a “large” corpus. Our graph-based approach achieved average scores (over AUC ROC, F1, Brier score, F0.5u, and C@1) of 90% and 92.83% when trained on the “small” and “large” corpora, respectively. Our model obtains results comparable to those of the state of the art in this task and better than those of traditional baselines.
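
The abstract does not spell out the three graph-building strategies, so the following is only a minimal sketch of the general idea of a POS co-occurrence graph, not the authors' method. It assumes the text has already been POS-tagged; the function name `pos_cooccurrence_graph` and the `window` parameter are hypothetical and chosen for illustration.

```python
from collections import Counter


def pos_cooccurrence_graph(pos_tags, window=1):
    """Build a directed, weighted graph from a sequence of POS labels.

    Nodes are the distinct POS labels. The weight of edge (a, b) counts how
    often label b occurs within `window` positions after label a.
    This is an illustrative assumption, not the paper's exact construction.
    """
    edges = Counter()
    for i, source in enumerate(pos_tags):
        # Look ahead at most `window` positions for co-occurring labels.
        for target in pos_tags[i + 1 : i + 1 + window]:
            edges[(source, target)] += 1
    nodes = sorted(set(pos_tags))
    return nodes, dict(edges)


# Hypothetical tag sequence for a sentence such as "The cat chased the mouse".
tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
nodes, edges = pos_cooccurrence_graph(tags)
print(nodes)
print(edges)
```

A graph like this, with node features and an edge list, is the kind of input a graph convolutional network could consume; two such graphs, one per text, would feed the two branches of a Siamese architecture.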

Keywords: authorship verification; graph neural networks; text graphs; Siamese network; POS tags
JEL-codes: C
Date: 2022

Downloads: (external link)
https://www.mdpi.com/2227-7390/10/2/277/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/2/277/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:2:p:277-:d:726262

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:10:y:2022:i:2:p:277-:d:726262