SESG-Optimizing Information Extraction in Chinese Clinical Texts: An Innovative Named Entity Recognition Approach Using RoBERTa-BiLSTM-CRF Mechanism

Li, Bin; Cheng, Haitao; Lin, Mengfei

Bin Li, Haitao Cheng and Mengfei Lin
Additional contact information
Bin Li: School of Humanities, Zhuhai City Polytechnic, Zhuhai, P. R. China
Haitao Cheng: School of Communication and Media, Guangzhou Huashang College, Guangzhou, P. R. China
Mengfei Lin: School of Economic and Management, Zhuhai City Polytechnic, Zhuhai, P. R. China

Journal of Information & Knowledge Management (JIKM), 2024, vol. 23, issue 06, 1-22

Abstract: Purpose: This study aims to improve the efficiency and effectiveness of Chinese clinical named entity recognition by replacing BERT with the RoBERTa pre-trained model in the BERT-BiLSTM-CRF architecture. Design/methodology/approach: A deep learning approach is employed that combines the RoBERTa pre-trained model, a Bi-directional Long Short-Term Memory (BiLSTM) network, and a Conditional Random Field (CRF) model to form a Named Entity Recognition (NER) model. The model takes the output of the pre-trained model as input, which mitigates the scarcity of annotated datasets; it leverages the strength of the BiLSTM in learning the contextual information of words; and it uses the CRF to infer labels from global information. Findings: The RoBERTa-BiLSTM-CRF model achieved satisfactory results in the experiments. It enhances the reasoning ability between characters, allows the model to fully learn the feature information of the text, and improves model performance to a certain extent. Originality/value: This paper proposes a RoBERTa-based medical named entity recognition model to address the scarcity of annotated data in medical named entity recognition tasks and BERT's inability to obtain word-level information. The model is not limited to medical entity recognition and shows potential for other medical natural language processing tasks, with data enhancement, data optimization, and domain transfer considered as ways to improve model performance and generalization capabilities.
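
The CRF step mentioned in the abstract, inferring labels from global information, can be illustrated with a small Viterbi decoder. This is only a minimal sketch under stated assumptions, not the authors' implementation: the tag set (TAGS), the emission scores, and the transition scores below are hypothetical toy values. In a model of the kind described, the emission scores would presumably come from the RoBERTa and BiLSTM layers and the transition scores would be learned by the CRF.

```python
from typing import List, Sequence

# Hypothetical BIO tag set for a single "disease" entity type (an assumption).
TAGS = ["O", "B-DIS", "I-DIS"]


def greedy_decode(emissions: Sequence[Sequence[float]]) -> List[int]:
    """Pick the best tag at each position independently (no CRF)."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in emissions]


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence over the whole input.

    emissions[t][j]   : score of tag j at position t.
    transitions[i][j] : score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    # best[j] = best score of any path that ends in tag j at the current position.
    best = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            # Choose the previous tag that maximizes the path score into tag j.
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)

    # Trace the best path back from the best final tag.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy scores for a three-character span (hypothetical values).
emissions = [
    [2.0, 1.5, 0.1],  # position 0: "O" looks best locally
    [0.2, 0.3, 2.0],  # position 1: "I-DIS" looks best
    [0.1, 0.2, 2.0],  # position 2: "I-DIS" looks best
]
# Only one constraint: strongly penalize the invalid move O -> I-DIS.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 0.0],    # from B-DIS
    [0.0, 0.0, 0.0],    # from I-DIS
]

greedy_tags = [TAGS[i] for i in greedy_decode(emissions)]
crf_tags = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(greedy_tags)  # ['O', 'I-DIS', 'I-DIS']  (invalid BIO sequence)
print(crf_tags)     # ['B-DIS', 'I-DIS', 'I-DIS']
```

With these toy scores, per-character decoding yields an entity that starts with I-DIS, while decoding over the whole sequence with transition scores recovers a well-formed B-DIS, I-DIS, I-DIS span.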

Keywords: RoBERTa-BiLSTM-CRF; NER; information extraction; deep learning; knowledge management
Date: 2024
Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0219649224500904
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:23:y:2024:i:06:n:s0219649224500904

DOI: 10.1142/S0219649224500904

Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh

More articles in Journal of Information & Knowledge Management (JIKM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.