ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition

Boudjellal, Nada; Zhang, Huaping; Khan, Asif; Ahmad, Arshad; Naseem, Rashid; Shang, Jianyun; Dai, Lin; Khan, Atif

Complexity, 2021, vol. 2021, 1-6

Abstract: The web is loaded daily with a huge volume of data, mainly unstructured text, which significantly increases the need for information extraction and NLP systems. Named-entity recognition is a key step towards understanding text data efficiently, saving time and effort. Because English is so widely used globally, it dominates the research conducted in this field, especially in the biomedical domain, whereas Arabic suffers from a lack of resources. This work presents a BERT-based model that identifies biomedical named entities (specifically disease and treatment entities) in Arabic text. It investigates how effectively pretraining a monolingual BERT model on a small-scale biomedical dataset enhances the model's understanding of Arabic biomedical text. The model's performance was compared with that of two state-of-the-art models, AraBERT and multilingual BERT cased, and it outperformed both with an F1-score of 85%.
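
For readers unfamiliar with how an F1-score is typically obtained for named-entity recognition, the following is a minimal sketch of entity-level scoring. It is not taken from the paper: the BIO tagging scheme, the exact-match criterion, and the label names `DIS` (disease) and `TRT` (treatment) are all assumptions made for illustration, and the authors' evaluation may differ.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match, entity-level F1 between gold and predicted tags."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    tp = len(gold_spans & pred_spans)  # spans matching in type and boundaries
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-token sentence: one disease and one treatment mention.
gold = ["B-DIS", "I-DIS", "O", "B-TRT", "O"]
pred = ["B-DIS", "I-DIS", "O", "O", "O"]  # the treatment mention is missed
print(round(entity_f1(gold, pred), 3))
```

In this toy case the single predicted entity is correct (precision 1.0) but only one of the two gold entities is found (recall 0.5), so the harmonic mean is about 0.667.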

Date: 2021

Downloads: (external link)
http://downloads.hindawi.com/journals/complexity/2021/6633213.pdf (application/pdf)
http://downloads.hindawi.com/journals/complexity/2021/6633213.xml (application/xml)

Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:6633213

DOI: 10.1155/2021/6633213
