Combining Lexicon Definitions and the Retrieval-Augmented Generation of a Large Language Model for the Automatic Annotation of Ancient Chinese Poetry
Jiabin Li,
Tingxin Wei,
Weiguang Qu,
Bin Li,
Minxuan Feng and
Dongbo Wang
Additional contact information
Jiabin Li: School of Liberal Arts, Nanjing Normal University, Nanjing 210023, China
Tingxin Wei: School of International Culture and Education, Nanjing Normal University, Nanjing 210023, China
Weiguang Qu: School of Liberal Arts, Nanjing Normal University, Nanjing 210023, China
Bin Li: School of Liberal Arts, Nanjing Normal University, Nanjing 210023, China
Minxuan Feng: School of Liberal Arts, Nanjing Normal University, Nanjing 210023, China
Dongbo Wang: School of Information Management, Nanjing Agricultural University, Nanjing 210095, China
Mathematics, 2025, vol. 13, issue 12, 1-19
Abstract:
Existing approaches to the automatic annotation of classical Chinese poetry often fail to generate precise source citations and depend heavily on manual segmentation, limiting their scalability and accuracy. To address these shortcomings, we propose a novel paradigm that integrates dictionary retrieval with retrieval-augmented large language model enhancements for automatic poetic annotation. Our method leverages the contextual understanding capabilities of large models to dynamically select appropriate lexical senses and employs an automated segmentation technique to minimize reliance on manual splitting. For poetic segments absent from standard dictionaries, the system retrieves pertinent information from a domain-specific knowledge base and generates definitions grounded in this auxiliary data, thereby substantially improving both annotation accuracy and coverage. The experimental results demonstrate that our approach outperforms general-purpose large language models and pre-trained classical Chinese language models on automatic annotation tasks; notably, it achieves a micro-averaged accuracy of 94.33% on key semantic segments. By delivering more precise and comprehensive annotations, this framework advances the computational analysis of classical Chinese poetry and offers significant potential for intelligent teaching applications and digital humanities research.
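The pipeline described in the abstract (segment a line automatically, choose a dictionary sense in context, and fall back to knowledge-base retrieval for segments the dictionary lacks) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the segmentation technique, so forward maximum matching is used here as a stand-in, and `choose_sense` and `generate_definition` are hypothetical placeholders for the large language model calls. The lexicon and knowledge-base entries are made-up illustrative data.

```python
from typing import Callable, Dict, List, Tuple

Lexicon = Dict[str, List[str]]   # segment -> candidate senses
KnowledgeBase = Dict[str, str]   # key -> supporting passage


def segment(line: str, lexicon: Lexicon, max_len: int = 4) -> List[str]:
    """Forward maximum matching (an assumed stand-in for the paper's segmenter):
    take the longest lexicon entry at each position, else a single character."""
    pieces: List[str] = []
    i = 0
    while i < len(line):
        for size in range(min(max_len, len(line) - i), 0, -1):
            piece = line[i:i + size]
            if size == 1 or piece in lexicon:
                pieces.append(piece)
                i += size
                break
    return pieces


def annotate(
    line: str,
    lexicon: Lexicon,
    knowledge_base: KnowledgeBase,
    choose_sense: Callable[[str, str, List[str]], str],
    generate_definition: Callable[[str, str, List[str]], str],
) -> List[Tuple[str, str, str]]:
    """Return (segment, annotation, source) triples for one line of poetry."""
    notes: List[Tuple[str, str, str]] = []
    for seg in segment(line, lexicon):
        senses = lexicon.get(seg)
        if senses:
            # Dictionary hit: the model only selects among existing senses.
            notes.append((seg, choose_sense(seg, line, senses), "lexicon"))
        else:
            # Dictionary miss: retrieve passages, then generate a grounded definition.
            passages = [text for key, text in knowledge_base.items() if seg in key]
            notes.append((seg, generate_definition(seg, line, passages), "generated"))
    return notes


# Hypothetical stand-ins for the model calls; a real system would prompt an LLM.
def first_sense(seg: str, line: str, senses: List[str]) -> str:
    return senses[0]


def first_passage(seg: str, line: str, passages: List[str]) -> str:
    return passages[0] if passages else "(no supporting passage found)"


# Illustrative data only.
demo_lexicon: Lexicon = {"明月": ["bright moon"], "光": ["light", "only"]}
demo_kb: KnowledgeBase = {"床": "a bed or couch"}
demo_notes = annotate("床前明月光", demo_lexicon, demo_kb, first_sense, first_passage)
```

Keeping the source tag ("lexicon" or "generated") on each annotation makes it possible to cite the dictionary for retrieved senses and to flag generated definitions for review.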
Keywords: automatic annotation; knowledge base construction; large language model
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/12/2023/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/12/2023/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:12:p:2023-:d:1682660
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.