Adapting support vector optimisation algorithms to textual gender classification
Javier Gomez,
Cesar Alfaro,
Felipe Ortega,
Javier M. Moguerza,
Maria Jesus Algar and
Raul Moreno
Additional contact information
Javier Gomez: Rey Juan Carlos University
Cesar Alfaro: Rey Juan Carlos University
Felipe Ortega: Rey Juan Carlos University
Javier M. Moguerza: Rey Juan Carlos University
Maria Jesus Algar: Rey Juan Carlos University
Raul Moreno: Rey Juan Carlos University
TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, 2024, vol. 32, issue 3, No 5, 463-488
Abstract:
In this paper, we focus on the problem of determining the gender of the person described in a biographical text. Since support vector machine classifiers are well suited for text classification tasks, we present a new stopping criterion for support vector optimisation algorithms tailored to this problem. This new approach exploits the geometric properties of the vector representation of such content. An experiment on a set of English and Spanish biographical articles retrieved from Wikipedia illustrates this approach and compares it to other machine learning classification algorithms. The proposed method allows real-time classification algorithm training. Moreover, these results confirm the advantage of leveraging additional gender information in strongly inflected languages, like Spanish, for this task.
Keywords: Support vector machines; Machine learning; Nonlinear optimisation; Text mining; Gender identification; 68U15; 68T50
Date: 2024
Downloads:
http://link.springer.com/10.1007/s11750-024-00671-1 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:topjnl:v:32:y:2024:i:3:d:10.1007_s11750-024-00671-1
Ordering information: This journal article can be ordered from
http://link.springer.de/orders.htm
DOI: 10.1007/s11750-024-00671-1
TOP: An Official Journal of the Spanish Society of Statistics and Operations Research is currently edited by Juan José Salazar González and Gustavo Bergantiños
More articles in TOP: An Official Journal of the Spanish Society of Statistics and Operations Research from Springer, Sociedad de Estadística e Investigación Operativa
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.