Automatic generation of Japanese–English bilingual thesauri based on bilingual corpora

Keita Tsuji and Kyo Kageura

Journal of the American Society for Information Science and Technology, 2006, vol. 57, issue 7, 891-906

Abstract: The authors propose a method for automatically generating Japanese–English bilingual thesauri based on bilingual corpora. The term bilingual thesaurus refers to a set of bilingual equivalent words and their synonyms. Most of the methods proposed so far for extracting bilingual equivalent word clusters from bilingual corpora depend heavily on word frequency and are not effective for dealing with low‐frequency clusters. These low‐frequency bilingual clusters are worth extracting because they contain many newly coined terms that are in demand but are not listed in existing bilingual thesauri. Assuming that single language‐pair‐independent methods such as frequency‐based ones have reached their limitations and that a language‐pair‐dependent method used in combination with other methods shows promise, the authors propose the following approach: (a) Extract translation pairs based on transliteration patterns; (b) remove the pairs from among the candidate words; (c) extract translation pairs based on word frequency from the remaining candidate words; and (d) generate bilingual clusters based on the extracted pairs using a graph‐theoretic method. The proposed method has been found to be significantly more effective than other methods.
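The four steps (a) to (d) listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the transliteration patterns, the frequency measure, or the graph-theoretic method. The sketch assumes a caller-supplied matcher (`is_transliteration`, hypothetical), a Dice coefficient over aligned segments for the frequency step, and connected components for the clustering step. All function names, the threshold, and the toy data are illustrative assumptions.

```python
from collections import defaultdict

def transliteration_pairs(ja_words, en_words, is_transliteration):
    """Step (a): pair words accepted by a caller-supplied matcher (assumption)."""
    return {(ja, en) for ja in ja_words for en in en_words
            if is_transliteration(ja, en)}

def frequency_pairs(segments, ja_words, en_words, threshold):
    """Step (c): pair remaining words by Dice coefficient (assumed measure)."""
    ja_count, en_count, both = defaultdict(int), defaultdict(int), defaultdict(int)
    for ja_seg, en_seg in segments:
        ja_here = set(ja_seg) & ja_words
        en_here = set(en_seg) & en_words
        for ja in ja_here:
            ja_count[ja] += 1
        for en in en_here:
            en_count[en] += 1
        for ja in ja_here:
            for en in en_here:
                both[(ja, en)] += 1
    return {(ja, en) for (ja, en), n in both.items()
            if 2 * n / (ja_count[ja] + en_count[en]) >= threshold}

def clusters_from_pairs(pairs):
    """Step (d): connected components of the pair graph (assumed method)."""
    graph = defaultdict(set)
    for ja, en in pairs:
        graph[("ja", ja)].add(("en", en))
        graph[("en", en)].add(("ja", ja))
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(frozenset(component))
    return clusters

def build_thesaurus(segments, is_transliteration, dice_threshold=0.8):
    ja_words = {w for ja_seg, _ in segments for w in ja_seg}
    en_words = {w for _, en_seg in segments for w in en_seg}
    translit = transliteration_pairs(ja_words, en_words, is_transliteration)  # (a)
    ja_rest = ja_words - {ja for ja, _ in translit}                           # (b)
    en_rest = en_words - {en for _, en in translit}
    freq = frequency_pairs(segments, ja_rest, en_rest, dice_threshold)        # (c)
    return clusters_from_pairs(translit | freq)                               # (d)

# Toy aligned, pre-tokenized segments (hypothetical data, not from the paper).
segments = [
    (["データ", "検索"], ["data", "retrieval"]),
    (["データ", "索引"], ["data", "index"]),
    (["検索", "索引"], ["retrieval", "index"]),
    (["コンピュータ"], ["computer"]),
    (["コンピューター"], ["computer"]),
]
# Stand-in for real transliteration patterns: a fixed lookup table.
TOY_TRANSLITERATIONS = {("データ", "data"), ("コンピュータ", "computer"),
                        ("コンピューター", "computer")}

def toy_matcher(ja, en):
    return (ja, en) in TOY_TRANSLITERATIONS

clusters = build_thesaurus(segments, toy_matcher)
```

In this toy run the two katakana spelling variants end up in one cluster with "computer" because both are linked to the same English word, which is how a graph-based grouping can collect synonyms around a shared translation. Removing the transliterated words in step (b) keeps them from competing with the remaining words in the frequency step.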

Date: 2006

DOI: https://doi.org/10.1002/asi.20351

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:57:y:2006:i:7:p:891-906

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890

Handle: RePEc:bla:jamist:v:57:y:2006:i:7:p:891-906