The Relation Dimension in the Identification and Classification of Lexically Restricted Word Co-Occurrences in Text Corpora

Shvets, Alexander; Wanner, Leo

Alexander Shvets and Leo Wanner
Additional contact information
Alexander Shvets: NLP Group, Pompeu Fabra University, 08018 Barcelona, Spain
Leo Wanner: Catalan Institute for Research and Advanced Studies (ICREA) and NLP Group, Pompeu Fabra University, 08018 Barcelona, Spain

Mathematics, 2022, vol. 10, issue 20, 1-21

Abstract: The speech of native speakers is full of idiosyncrasies. Especially prominent are lexically restricted binary word co-occurrences of the type high esteem, strong tea, run [an] experiment, war break(s) out, etc. In lexicography, such co-occurrences are referred to as collocations. Due to their semi-decompositional nature, collocations are of high relevance to a large number of natural language processing applications as well as to second language learning. A substantial body of work exists on the automatic recognition of collocations in textual material and, increasingly, also on their semantic classification, even if not yet in mainstream research. In particular, classification with respect to the lexical function (LF) taxonomy, which is the most detailed semantically oriented taxonomy of collocations available to date, has proved to be of real use to human speakers and machines alike. The most recent approaches in the field are based on multilingual neural graph transformer models that use explicit syntactic dependencies. Our goal is to explore whether the extension of such a model by a semantic relation extraction network improves its classification performance or whether it already learns the corresponding semantic relations from the dependencies and the sentential contexts, such that an additional relation extraction network will not improve the overall performance. The experiments show that the semantic relation extraction layer indeed improves the overall performance of a graph transformer. However, this improvement is not very significant, such that we can conclude that graph transformers already learn to a certain extent the semantics of the dependencies between the collocation elements.

Keywords: idiosyncratic word co-occurrences; collocations; lexical functions; multilingual; graph transformers; multitask learning; semantic relation extraction
JEL-codes: C
Date: 2022

Downloads:
https://www.mdpi.com/2227-7390/10/20/3831/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/20/3831/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:20:p:3831-:d:944388

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.