Named Entity System for Tweets in Hindi Language
Arti Jain and Anuja Arora
Additional contact information
Arti Jain: Jaypee Institute of Information Technology, Noida, India
Anuja Arora: Jaypee Institute of Information Technology, Noida, India
International Journal of Intelligent Information Technologies (IJIIT), 2018, vol. 14, issue 4, 55-76
Abstract:
Growing demand for smart-health applications in Hindi has created a need for a health-related Named Entity Recognition (NER) system for the language. To this end, this research extracts tweets dated 1 October 2016 to 15 October 2017 from Patanjali, Dabur and other Hindi-language-oriented, Twitter-based health sites, and considers four named entity (NE) types: Person, Disease, Consumable and Organization. To the best of the authors' knowledge, this Twitter dataset and set of NE types are among the first such resources for Hindi. The article introduces HinTwtNER, a three-stage NER system for tweets in Hindi: a pre-processing stage; a machine learning stage that uses Hyperspace Analogue to Language (HAL) and Conditional Random Field (CRF); and a post-processing stage. HinTwtNER uses binary features and achieves an overall F-score of 49.87%, which is comparable to Twitter-based NER systems for English and other languages.
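As a rough illustration of the HAL component named in the abstract, the sketch below builds a HAL-style weighted co-occurrence table from a token list. It is not the authors' implementation: the function name `hal_matrix`, the window size and the toy tokens are assumptions, and the abstract does not say how HAL output is passed to the CRF.

```python
from collections import defaultdict


def hal_matrix(tokens, window=5):
    """Build HAL-style co-occurrence weights (illustrative sketch only).

    weights[target][context] accumulates (window - d + 1) each time
    `context` appears d positions before `target`, for 1 <= d <= window.
    Closer context words therefore receive larger weights.
    The window size is an assumed parameter, not taken from the paper.
    """
    weights = defaultdict(lambda: defaultdict(int))
    for i, target in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break  # ran past the start of the token list
            weights[target][tokens[j]] += window - d + 1
    return weights


# Hypothetical toy input; real input would be pre-processed tweet tokens.
example = hal_matrix(["a", "b", "c"], window=2)
```

In this toy run, `example["c"]["b"]` is 2 (adjacent words) and `example["c"]["a"]` is 1 (two positions apart), which shows the distance-based weighting.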
Date: 2018
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJIIT.2018100104 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jiit00:v:14:y:2018:i:4:p:55-76
International Journal of Intelligent Information Technologies (IJIIT) is currently edited by Vijayan Sugumaran