Comparing Deep-Learning Architectures and Traditional Machine-Learning Approaches for Satire Identification in Spanish Tweets
Óscar Apolinario-Arzube,
José Antonio García-Díaz,
José Medina-Moreira,
Harry Luna-Aveiga and
Rafael Valencia-García
Additional contact information
Óscar Apolinario-Arzube: Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Cdla. Universitaria Salvador Allende, Guayaquil 090514, Ecuador
José Antonio García-Díaz: Facultad de Informática, Universidad de Murcia, Campus de Espinardo, 30100 Murcia, Spain
José Medina-Moreira: Facultad de Ciencias Agrarias, Universidad Agraria del Ecuador, Av. 25 de Julio, Guayaquil 090114, Ecuador
Harry Luna-Aveiga: Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Cdla. Universitaria Salvador Allende, Guayaquil 090514, Ecuador
Rafael Valencia-García: Facultad de Informática, Universidad de Murcia, Campus de Espinardo, 30100 Murcia, Spain
Mathematics, 2020, vol. 8, issue 11, 1-23
Abstract:
Automatic satire identification helps to detect texts whose intended meaning differs from their literal meaning, which can improve tasks such as sentiment analysis, fake news detection, and natural-language user interfaces. Typically, satire identification is performed by training a supervised classifier to find linguistic clues that determine whether a text is satirical. The state of the art relies on neural networks fed with word embeddings, which can learn characteristics of the way humans communicate. However, to the best of our knowledge, no comprehensive study has evaluated these techniques for satire identification in Spanish. Consequently, in this work we evaluate several deep-learning architectures with Spanish pre-trained word embeddings and compare the results with strong baselines based on term-counting features. The evaluation uses two datasets of satirical and non-satirical tweets written in two Spanish variants: European Spanish and Mexican Spanish. Our experiments revealed that term-counting features achieved results similar to those of deep-learning approaches based on word embeddings, with both outperforming previous results based on linguistic features. These results suggest that term-counting features and traditional machine-learning models are competitive for automatic satire identification, slightly outperforming state-of-the-art models.
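To make the idea of a term-counting baseline concrete, the following is a minimal sketch of a supervised classifier over raw term counts (multinomial Naive Bayes with add-one smoothing). It is not the authors' implementation: the class name `TermCountNaiveBayes`, the tokenizer, the label names, and the four toy tweets are all invented for illustration, and the abstract does not state which traditional models or features were actually used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, including Spanish accents."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


class TermCountNaiveBayes:
    """Hypothetical baseline: multinomial Naive Bayes over raw term counts."""

    def fit(self, texts, labels):
        # Number of training documents per class (used for the class prior).
        self.class_docs = Counter(labels)
        # Term counts accumulated separately for each class.
        self.term_counts = {label: Counter() for label in self.class_docs}
        for text, label in zip(texts, labels):
            self.term_counts[label].update(tokenize(text))
        # Shared vocabulary across all classes (used for smoothing).
        self.vocab = set()
        for counts in self.term_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, counts in self.term_counts.items():
            total_terms = sum(counts.values())
            # Log prior plus smoothed log likelihood of each known term.
            score = math.log(self.class_docs[label] / total_docs)
            for term in tokenize(text):
                if term not in self.vocab:
                    continue  # ignore terms never seen in training
                score += math.log(
                    (counts[term] + 1) / (total_terms + len(self.vocab))
                )
            scores[label] = score
        return max(scores, key=scores.get)


# Toy examples invented for illustration only; not taken from the paper's datasets.
train_texts = [
    "El gobierno prohíbe la lluvia los fines de semana",
    "Un gato es elegido alcalde tras prometer siestas obligatorias",
    "El ayuntamiento aprueba el presupuesto anual",
    "La selección gana el partido por dos goles",
]
train_labels = ["satire", "satire", "non-satire", "non-satire"]

model = TermCountNaiveBayes().fit(train_texts, train_labels)
print(model.predict("Un gato prohíbe la lluvia"))
print(model.predict("El ayuntamiento aprueba el presupuesto"))
```

A real evaluation would use the full tweet datasets, a held-out test split, and typically weighted features such as TF-IDF; the sketch only shows how term counts alone can drive a classifier without word embeddings.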
Keywords: automatic satire identification; text classification; natural language processing
JEL-codes: C
Date: 2020
Downloads:
https://www.mdpi.com/2227-7390/8/11/2075/pdf (application/pdf)
https://www.mdpi.com/2227-7390/8/11/2075/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:8:y:2020:i:11:p:2075-:d:448299
Mathematics is currently edited by Ms. Emma He