Greasing the wheels for comparative communication research: Supervised text classification for multilingual corpora

Fabienne Lind, Tobias Heidenreich, Christoph Kralj and Hajo G. Boomgaarden

EconStor Open Access Articles and Book Chapters, 2021, vol. 3, issue 3, 1-30

Abstract: Employing supervised machine learning for text classification is already a resource-intensive endeavor in a monolingual setting. However, facing the challenge to classify a multilingual corpus, the cost of producing the required annotated documents quickly exceeds even generous time and financial constraints. We show how tools like automated annotation and machine translation can not only efficiently but also effectively be employed for the classification of a multilingual corpus with supervised machine learning. Our findings demonstrate that good results can already be achieved with the machine translation of about 250 to 350 documents per category class and language and a dictionary in just one language, which we perceive as a realistic scenario for many projects. The methodological strategy is applied to study migration frames in seven languages (news discourse in seven European countries) and discussed and evaluated for its usability in comparative communication research.
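The automated-annotation step described in the abstract can be illustrated with a minimal sketch. It assumes the documents have already been machine-translated into the dictionary's language. The frame names, the dictionary terms, the example sentences, and the `annotate` helper are all hypothetical and are not taken from the article; the resulting labels would serve only as training input for a supervised classifier.

```python
import re
from collections import Counter

# Hypothetical single-language (English) dictionary: frame -> indicator terms.
# These frames and terms are illustrative assumptions, not the authors' dictionary.
FRAME_DICTIONARY = {
    "economy": {"jobs", "labour", "wages", "economy"},
    "security": {"crime", "border", "police", "terrorism"},
}


def annotate(text, dictionary=FRAME_DICTIONARY):
    """Return the frame whose terms occur most often, or None if no term matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for frame, terms in dictionary.items():
        counts[frame] = sum(1 for tok in tokens if tok in terms)
    frame, hits = counts.most_common(1)[0]
    return frame if hits > 0 else None


# Example documents, assumed to be machine-translated into English beforehand.
translated_docs = [
    "Migrants fill jobs and raise wages in the regional economy.",
    "Police report more border crime this year.",
    "The weather was mild across the country.",
]

# Automatically produced labels; unmatched documents stay unlabeled (None).
labels = [annotate(doc) for doc in translated_docs]
```

Because the dictionary exists in only one language, translation happens before annotation in this sketch; documents without any dictionary match are left unlabeled rather than forced into a category.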

Keywords: multilingual content analysis; text classification; comparative communication research; supervised machine learning; machine translation
Date: 2021
Downloads: (external link)
https://www.econstor.eu/bitstream/10419/250905/1/F ... asing-the-wheels.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:zbw:espost:250905

DOI: 10.5117/CCR2021.3.001.LIND

Page updated 2025-03-20
Handle: RePEc:zbw:espost:250905