Quality of Word Vectors and its Impact on Named Entity Recognition in Czech

František Dařena and Martin Süss
Additional contact information
Martin Süss: Mendel University in Brno, Czech Republic

European Journal of Business Science and Technology, 2020, vol. 6, issue 2, 154-169

Abstract: Named Entity Recognition (NER) focuses on finding named entities in text and classifying them into one of the entity types. Modern state-of-the-art NER approaches avoid using hand-crafted features and rely on feature-inferring neural network systems based on word embeddings. The paper analyzes the impact of different aspects related to word embeddings on the process and results of the named entity recognition task in Czech, which has not been investigated so far. Various aspects of word vectors preparation were experimentally examined to draw useful conclusions. The suitable settings in different steps were determined, including the used corpus, number of word vectors dimensions, used text preprocessing techniques, context window size, number of training epochs, and word vectors inferring algorithms and their specific parameters. The paper demonstrates that focusing on the process of word vectors preparation can bring a significant improvement for NER in Czech even without using additional language independent and dependent resources.
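Illustration (not part of the paper): the abstract lists several word-vector training settings that were varied. A minimal sketch of how such a set of experiments could be enumerated is given below. The aspect names follow the abstract, but every concrete value, the algorithm labels, and the names `SEARCH_SPACE` and `enumerate_configs` are hypothetical assumptions, not the authors' actual setup.

```python
from itertools import product

# Hypothetical search space. The aspects mirror those named in the abstract;
# all concrete values are illustrative assumptions, not taken from the paper.
SEARCH_SPACE = {
    "algorithm": ["word2vec-skipgram", "word2vec-cbow", "fasttext"],
    "dimensions": [100, 300],
    "window": [2, 5],
    "epochs": [5, 10],
    "lowercase": [False, True],  # stands in for a text preprocessing choice
}


def enumerate_configs(space):
    """Return one dict per combination of settings in `space`."""
    keys = list(space)
    value_lists = [space[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


configs = enumerate_configs(SEARCH_SPACE)
print(f"{len(configs)} candidate configurations")
```

Each resulting dictionary would describe one word-vector training run whose vectors are then fed to the NER model for evaluation; a full grid like this grows quickly, so in practice one aspect at a time may be varied instead.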

Keywords: Named Entity Recognition; word embeddings; word vectors training; natural language processing; Czech language
JEL-codes: C63 C88
Date: 2020

Downloads: (external link)
http://ejobsat.cz/doi/10.11118/ejobsat.2020.010.html (text/html)
http://ejobsat.cz/doi/10.11118/ejobsat.2020.010.pdf (application/pdf)
free of charge

Persistent link: https://EconPapers.repec.org/RePEc:men:journl:v:6:y:2020:i:2:p:154-169

DOI: 10.11118/ejobsat.2020.010

European Journal of Business Science and Technology is currently edited by Svatopluk Kapounek

Series: European Journal of Business Science and Technology, from Mendel University in Brno, Faculty of Business and Economics.
Bibliographic data for series maintained by Ivo Andrle.

 
Page updated 2025-03-19
Handle: RePEc:men:journl:v:6:y:2020:i:2:p:154-169