A delimiter‐based general approach for Chinese term extraction

Yuhang Yang, Qin Lu and Tiejun Zhao

Journal of the American Society for Information Science and Technology, 2010, vol. 61, issue 1, 111-125

Abstract: This article presents a two‐step approach to term extraction. In the first step, term candidate extraction, a new delimiter‐based approach identifies features of the delimiters of term candidates rather than features of the term candidates themselves. This delimiter‐based method is more stable and domain independent than previous approaches. In the second step, term verification, a link analysis algorithm calculates the relevance between term candidates and the sentences from which they are extracted. All information is obtained from the working domain corpus, with no need for prior domain knowledge. The approach does not target any specific domain and requires no extensive training when applied to new domains, which makes it especially useful for resource‐limited domains. Evaluations on Chinese text in two different domains show significant improvements over existing techniques and confirm the method's efficiency and relative domain independence. The method is also effective at extracting new terms, so it can serve as an efficient tool for updating domain knowledge, especially for expanding lexicons.
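
The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the delimiter list below is hand-picked and hypothetical (the abstract does not say how delimiters are identified), and the verification step uses a generic HITS-style mutual reinforcement between candidates and sentences as a stand-in for the unspecified link analysis algorithm. The function names `extract_candidates` and `rank_candidates` are illustrative only.

```python
import re

# Hypothetical delimiter list (assumption): common function words and
# punctuation that tend to border terms in Chinese text.
DELIMITERS = ["的", "是", "和", "在", "了", "，", "。"]


def extract_candidates(sentence, delimiters=DELIMITERS, min_len=2):
    """Step 1 (sketch): split a sentence on delimiters; the segments
    left between delimiters are the term candidates."""
    # Longer delimiters first so that multi-character ones match before
    # any single-character prefix.
    pattern = "|".join(
        re.escape(d) for d in sorted(delimiters, key=len, reverse=True)
    )
    return [seg for seg in re.split(pattern, sentence) if len(seg) >= min_len]


def rank_candidates(sentences, delimiters=DELIMITERS, iterations=20):
    """Step 2 (sketch): HITS-style scoring on the bipartite graph that
    links each candidate to the sentences containing it. This is an
    assumed stand-in for the paper's link analysis, not its algorithm."""
    links = [set(extract_candidates(s, delimiters)) for s in sentences]
    cand_score = {c: 1.0 for cands in links for c in cands}
    sent_score = [1.0] * len(sentences)

    for _ in range(iterations):
        # A candidate is reinforced by the sentences it occurs in.
        new_cand = {c: 0.0 for c in cand_score}
        for i, cands in enumerate(links):
            for c in cands:
                new_cand[c] += sent_score[i]
        norm = sum(v * v for v in new_cand.values()) ** 0.5 or 1.0
        cand_score = {c: v / norm for c, v in new_cand.items()}

        # A sentence is reinforced by the candidates it contains.
        sent_score = [sum(cand_score[c] for c in cands) for cands in links]
        norm = sum(v * v for v in sent_score) ** 0.5 or 1.0
        sent_score = [v / norm for v in sent_score]

    return sorted(cand_score.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    corpus = [
        "信息检索是计算机科学的分支。",
        "信息检索和文本挖掘",
        "文本挖掘的方法",
    ]
    for term, score in rank_candidates(corpus):
        print(term, round(score, 3))
```

Because the sketch needs only the corpus and a delimiter list, it mirrors the abstract's claim that no prior domain knowledge is required. A real system would presumably derive the delimiters from data rather than fix them by hand.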

Date: 2010

Downloads: (external link)
https://doi.org/10.1002/asi.21221

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:61:y:2010:i:1:p:111-125

