Categorization‐driven cross‐language retrieval of medical information
Hermes R. Freitas‐Junior, Berthier Ribeiro‐Neto, Rodrigo F. Vale, Alberto H. F. Laender and Luciano R. S. Lima
Journal of the American Society for Information Science and Technology, 2006, vol. 57, issue 4, 501-510
Abstract:
The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross‐language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross‐language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language‐independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross‐language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
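The abstract does not give the categorization procedure or the weighting scheme, so the following is only a minimal sketch of the general idea of matching queries and documents through language-independent concepts in a vector space. The `CONCEPTS` lexicon, the concept identifiers, and the function names are all hypothetical, and a hand-written lookup table stands in for the automatic categorization step. The sketch scores on concepts alone, whereas the method described above also combines them with the query terms.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: surface terms in several languages mapped to
# language-independent concept identifiers. This stands in for an automatic
# categorization procedure and is NOT taken from the paper.
CONCEPTS = {
    "heart": "C_HEART", "coração": "C_HEART", "corazón": "C_HEART", "coeur": "C_HEART",
    "attack": "C_INFARCT", "infarto": "C_INFARCT", "infarctus": "C_INFARCT",
    "kidney": "C_KIDNEY", "rim": "C_KIDNEY", "riñón": "C_KIDNEY", "rein": "C_KIDNEY",
}


def to_concepts(text):
    """Map a text to a bag of concept identifiers, ignoring unknown terms."""
    tokens = text.lower().split()
    return Counter(CONCEPTS[t] for t in tokens if t in CONCEPTS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return document indices ordered by decreasing concept similarity."""
    q = to_concepts(query)
    scored = [(cosine(q, to_concepts(d)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


# English documents, Portuguese query: no surface terms are shared, but
# both sides map to the same concept identifiers.
docs = ["heart attack risk factors", "kidney transplant outcomes"]
print(rank("infarto coração", docs))
```

Because both the query and the documents are reduced to the same concept space before scoring, the comparison itself never needs a query translation step; the quality of the ranking then depends on how well the concept assignment works, which this toy lookup does not attempt to model.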
Date: 2006
Download: https://doi.org/10.1002/asi.20320
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:57:y:2006:i:4:p:501-510
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890
More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology