Research on Entity and Relationship Extraction with Small Training Samples for Cotton Pests and Diseases

Yuan, Weiwei; Yang, Wanxia; He, Liang; Zhang, Tingwei; Hao, Yan; Lu, Jing; Yan, Wenbo

Weiwei Yuan, Wanxia Yang, Liang He, Tingwei Zhang, Yan Hao, Jing Lu and Wenbo Yan
Additional contact information
Weiwei Yuan: College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China
Wanxia Yang: College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China
Liang He: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Tingwei Zhang: College of Plant Protection, Gansu Agricultural University, Lanzhou 730070, China
Yan Hao: College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China
Jing Lu: College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China
Wenbo Yan: College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China

Agriculture, 2024, vol. 14, issue 3, 1-16

Abstract: Entity and relationship extraction is a crucial task in natural language processing (NLP). However, existing models for this task rely heavily on large amounts of labeled data, which is time-consuming and labor-intensive to produce and hinders the development of downstream tasks. To improve the model’s ability to learn from small samples, this paper proposes an entity and relationship extraction method based on the Universal Information Extraction (UIE) model. The core of the approach is a specialized prompt template and schema for cotton pests and diseases, which serves as one of the main inputs to the UIE and guides fine-tuning so that the model can subdivide the entities and relationships in the corpus. With this design, the UIE-base model achieves an accuracy of 86.5% with only 40 labeled training samples, addressing the problem that existing models require a large amount of manually labeled training data for knowledge extraction. To verify the generalization ability of the proposed model, it is compared experimentally with four classical models, including Bert-BiLSTM-CRF. The results show that its F1 value is 1.4% higher than that of the Bert-BiLSTM-CRF model on the self-built cotton data set and 2.5% higher on the public data set. Further experiments verify that the UIE-base model has the best small-sample learning performance when the number of samples is 40. This paper provides an effective method for small-sample knowledge extraction.

Keywords: cotton pests and diseases; entity and relationship extraction; UIE; small-sample learning; fine-tuning
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2024

Downloads:
https://www.mdpi.com/2077-0472/14/3/457/pdf (application/pdf)
https://www.mdpi.com/2077-0472/14/3/457/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:14:y:2024:i:3:p:457-:d:1354990

Agriculture is currently edited by Ms. Leda Xuan
