SynEL: A synthetic benchmark for entity linking
Ilia Karpov,
Alexander Kirillovich,
Elisaveta Goncharova,
Andrey Parinov,
Alexander Chernyavskiy,
Dmitry Ilvovsky,
Natalia Semenova,
Artyom Sosedka,
Ekaterina Lisitsyna and
Mikhail Belkin
PLOS ONE, 2026, vol. 21, issue 1, 1-18
Abstract:
Large language models (LLMs) offer significant potential for constructing commonsense knowledge graphs from text, demonstrating adaptability across diverse domains. However, their effectiveness varies significantly with domain-specific language, highlighting a critical need for specialized benchmarks to assess and optimize knowledge graph construction sub-tasks like named entity recognition, relation extraction, and entity linking. Currently, domain-specific benchmarks are scarce. To address this gap, we introduce SynEL, a novel benchmark developed for evaluating text-based knowledge extraction methods, validated using customer support dialogues. We present a comprehensive methodology for benchmark construction, propose two distinct approaches for generating synthetic datasets, and evaluate accumulated hallucinations. Our experiments reveal that existing LLMs experience a significant performance drop, with micro-F1 scores decreasing by up to 25 absolute points when extracting low-resource entities compared to high-resource entities from sources like Wikipedia. Furthermore, by incorporating synthetic datasets into the training process, we achieved an improvement in micro-F1 scores of up to 10 absolute points. We publicly release our benchmark and generation code to demonstrate its utility for fine-tuning and evaluating LLMs.
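For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how micro-F1 is commonly computed for entity linking, pooling all predicted links across documents before counting matches. It is an illustration under stated assumptions, not the evaluation code released with SynEL: the tuple layout, the function names, and the sample identifiers are hypothetical, and the paper's exact matching criteria may differ.

```python
from typing import Iterable, Tuple

# Hypothetical link format: (document id, mention start, mention end, entity id).
Link = Tuple[str, int, int, str]


def micro_f1(gold: Iterable[Link], predicted: Iterable[Link]) -> float:
    """Micro-averaged F1 over all links pooled across documents."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)  # exact span and entity match
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def absolute_point_gap(score_a: float, score_b: float) -> float:
    """Difference between two F1 scores, expressed in absolute points (0-100 scale)."""
    return (score_a - score_b) * 100


# Made-up annotations for illustration only.
gold_links = [
    ("doc1", 0, 5, "E1"),
    ("doc1", 10, 18, "E2"),
    ("doc2", 3, 9, "E3"),
    ("doc2", 20, 27, "E4"),
]
predicted_links = [
    ("doc1", 0, 5, "E1"),
    ("doc1", 10, 18, "E2"),
    ("doc2", 3, 9, "E3"),
    ("doc2", 20, 27, "E9"),  # correct span, wrong entity
]

score = micro_f1(gold_links, predicted_links)
```

With these sample annotations, three of four predicted links match the gold set, so precision, recall, and micro-F1 all equal 0.75. A gap of "25 absolute points" in this sense means a difference such as 0.75 versus 0.50, not a 25% relative change.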
Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0339468 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 39468&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0339468
DOI: 10.1371/journal.pone.0339468