Multi-Distribution Characteristics Based Chinese Entity Synonym Extraction from The Web
Xiuxia Ma,
Xiangfeng Luo,
Subin Huang and
Yike Guo
Additional contact information
Xiuxia Ma: School of Computer Engineering and Science, Shanghai University, Shanghai, China
Xiangfeng Luo: School of Computer Engineering and Science, Shanghai University, Shanghai, China
Subin Huang: School of Computer Engineering and Science, Shanghai University, Shanghai, China
Yike Guo: Imperial College London, London, UK
International Journal of Intelligent Information Technologies (IJIIT), 2019, vol. 15, issue 3, 42-63
Abstract:
Entity synonyms play an important role in natural language processing applications, such as query expansion and question answering. Entity synonyms exhibit three main distribution characteristics in web texts: 1) appearing in parallel structures; 2) occurring with specific patterns in sentences; and 3) being distributed in similar contexts. The first and second characteristics rely on reliable prior knowledge and are susceptible to data sparseness, which yields high accuracy but low recall in synonym extraction. The third may lead to high recall but low accuracy, since it identifies only a somewhat loose semantic similarity. Existing methods, such as context-based and pattern-based methods, consider only one characteristic for synonym extraction and rarely take their complementarity into account. To increase recall, this article proposes a novel extraction framework that combines the three characteristics for extracting synonyms from the web, in which an Entity Synonym Network (ESN) is built to incorporate synonymous knowledge. To improve accuracy, the article treats synonym detection as a ranking problem and uses the Spreading Activation model to rank candidates and detect hard noise in the ESN. Experimental results show that the proposed method achieves better accuracy and recall than state-of-the-art methods.
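The ranking idea in the abstract can be illustrated with a minimal sketch of generic spreading activation over a small weighted graph. This is not the article's implementation: the function name `spreading_activation`, the `decay` and `steps` parameters, the edge weights, and the example entities are all assumptions made for illustration. Edge weights stand in for how much evidence (parallel structures, patterns, shared contexts) links two candidate entities.

```python
from collections import defaultdict


def spreading_activation(edges, seed, decay=0.5, steps=3):
    """Rank nodes by activation spread from `seed` (generic sketch, hypothetical API)."""
    # Build an undirected weighted adjacency map from (a, b, weight) triples.
    graph = defaultdict(dict)
    for a, b, w in edges:
        graph[a][b] = w
        graph[b][a] = w

    activation = defaultdict(float)
    activation[seed] = 1.0
    frontier = {seed: 1.0}

    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            total = sum(graph[node].values())
            if total == 0:
                continue
            # Pass a decayed share of this node's energy to each neighbour,
            # proportional to the edge weight.
            for nbr, w in graph[node].items():
                next_frontier[nbr] += decay * energy * w / total
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier

    # The seed itself is not a candidate synonym.
    activation.pop(seed, None)
    return sorted(activation.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical evidence counts linking a seed entity to candidates.
edges = [
    ("Peking University", "PKU", 3.0),
    ("Peking University", "Beida", 2.0),
    ("Peking University", "Tsinghua University", 0.2),
]
for name, score in spreading_activation(edges, "Peking University"):
    print(f"{name}: {score:.3f}")
```

In this toy graph the weakly connected candidate ends up at the bottom of the ranking, which is the sense in which a ranking step can help separate noise from likely synonyms; a threshold on the score would be one assumed way to filter it out.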
Date: 2019
Downloads:
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJIIT.2019070103 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jiit00:v:15:y:2019:i:3:p:42-63
International Journal of Intelligent Information Technologies (IJIIT) is currently edited by Vijayan Sugumaran