
A New Methodology for Chinese Term Extraction from Scientific Publications

Huaili Zheng and Ting Jiang
Additional contact information
Huaili Zheng: School of Computer and Artificial Intelligence, Nanjing University of Finance & Economics, Nanjing, China
Ting Jiang: School of Computer and Artificial Intelligence, Nanjing University of Finance & Economics, Nanjing, China

Innovation & Technology Advances, 2025, vol. 3, issue 2, 19-45

Abstract: To identify Chinese technical terms, this study focuses on extracting terms from a corpus of scientific publications. The process begins with the identification of term boundaries, followed by the application of Chinese part-of-speech (POS) patterns to extract candidate terms. Features of words or characters that signal term boundaries are defined, enabling the segmentation of sentences into smaller units and facilitating the removal of irrelevant terms that may not be filtered by other approaches. POS patterns are specifically designed for the extraction of Chinese technical terms. A comparison between candidate terms extracted using these POS patterns and those obtained via n-gram models shows that the proposed POS-based method effectively eliminates a significant portion of non-relevant terms while retaining most useful ones. In the term scoring phase, a novel method based on contextual information—referred to as the Hellinger distance for context information acquisition—is introduced. This approach proves more effective than existing context-based methods. Subsequently, the Hellinger distance method is integrated with Kullback–Leibler divergence to evaluate terms along the dimensions of informativeness and phraseness. The proposed term scoring method is compared with eight alternative approaches. Results demonstrate that it outperforms others in scoring Chinese terms, particularly in the extraction of multi-word terms.
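The abstract names two scoring ingredients, Hellinger distance over context information and Kullback–Leibler divergence for informativeness and phraseness, without giving formulas. The sketch below shows only the textbook definitions of those two quantities on toy data. It is not the authors' scoring method: the helper names (`normalize`, `hellinger`, `pointwise_kl`), the choice of neighbouring words as "context", and all counts and probabilities are assumptions made for illustration.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw counts into a probability distribution (dict of floats)."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(squared / 2.0)


def pointwise_kl(p, q):
    """Contribution p * log(p / q) of a single event to KL(p || q)."""
    return p * math.log(p / q)


# Hypothetical toy data: words seen next to a candidate term versus
# words seen next to any token in the corpus.
candidate_context = normalize(Counter({"基于": 6, "的": 2, "方法": 2}))
corpus_context = normalize(Counter({"的": 7, "基于": 1, "方法": 1, "是": 1}))

# How far the candidate's context distribution is from the corpus-wide one.
context_score = hellinger(candidate_context, corpus_context)

# Assumed probabilities of the candidate in a domain corpus and in a
# general corpus; a larger value suggests a more domain-specific term.
informativeness = pointwise_kl(0.004, 0.0005)
```

Hellinger distance is symmetric and bounded between 0 (identical distributions) and 1 (no shared support), which makes it convenient to compare across candidates. How the paper combines it with the divergence-based scores is not stated in the abstract, so no combination rule is shown here.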

Keywords: Automatic Term Extraction; Technical Term Extraction; Terminology Extraction; Context Information; Chinese Term Extraction
Date: 2025

Downloads: (external link)
https://bergersci.com/index.php/jta/article/view/222/65 (application/pdf)
https://bergersci.com/index.php/jta/article/view/222 (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:cwi:itadva:v:3:y:2025:i:2:p:19-45

DOI: 10.61187/ita.v3i2.222


More articles in Innovation & Technology Advances from Berger Science Press
Bibliographic data for series maintained by Berger Science Press.

 
Page updated 2026-02-14
Handle: RePEc:cwi:itadva:v:3:y:2025:i:2:p:19-45