Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval
Na Deng and
Caiquan Xiong
Additional contact information
Na Deng: Hubei University of Technology, China
Caiquan Xiong: Hubei University of Technology, China
International Journal of Data Warehousing and Mining (IJDWM), 2020, vol. 16, issue 3, 87-107
Abstract:
In the retrieval and mining of traditional Chinese medicine (TCM) patents, a key step is Chinese word segmentation and named entity recognition. However, the alias phenomenon of traditional Chinese medicines causes great challenges to Chinese word segmentation and named entity recognition in TCM patents, which directly affects the effect of patent mining. Because of the lack of a comprehensive Chinese herbal medicine name thesaurus, traditional thesaurus-based Chinese word segmentation and named entity recognition are not suitable for medicine identification in TCM patents. In view of the present situation, using the language characteristics and structural characteristics of TCM patent texts, a modified and serialized co-training method to recognize medicine names from TCM patent abstract texts is proposed. Experiments show that this method can maintain high accuracy under relatively low time complexity. In addition, this method can also be expanded to the recognition of other named entities in TCM patents, such as disease names, preparation methods, and so on.
Date: 2020
References: Add references at CitEc
Citations:
Downloads: (external link)
https://services.igi-global.com/resolvedoi/resolve ... 018/IJDWM.2020070105 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:16:y:2020:i:3:p:87-107
Access Statistics for this article
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede
More articles in International Journal of Data Warehousing and Mining (IJDWM) from IGI Global
Bibliographic data for series maintained by Journal Editor ().