
An evaluation of retrieval effectiveness using spelling‐correction and string‐similarity matching methods on Malay texts

Zainab Abu Bakar, Tengku Mohd T. Sembok and Mohammed Yusoff

Journal of the American Society for Information Science, 2000, vol. 51, issue 8, 691-706

Abstract: This article evaluates the effectiveness of spelling‐correction and string‐similarity matching methods in retrieving similar words in a Malay dictionary associated with a set of query words. The spelling‐correction techniques used are SPEEDCOP, Soundex, Davidson, Phonix, and Hartlib. Two dynamic‐programming methods that measure longest common subsequence and editcost‐distance are used. Several search combinations of query and dictionary words are performed in the experiments, the best being one that stems both query and dictionary words using an existing Malay stemming algorithm. The retrieval effectiveness (E) and retrieved and relevant (R&R) mean measures are calculated from a weighted combination of recall and precision values. Results from these experiments are then compared with those of digram matching, an available string‐similarity method. The best R&R and E results are given by using digram. Editcost‐distances produce the best E results, and both dynamic‐programming methods rank second in finding R&R mean measures.
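For readers unfamiliar with the string-similarity measures named in the abstract, the following is a minimal Python sketch of three of them: digram matching, longest common subsequence, and edit distance. It is illustrative only and is not the authors' implementation. The abstract does not give the exact formulations, so the sketch assumes the Dice coefficient for digram similarity and unit costs for insertion, deletion, and substitution; the function names are hypothetical.

```python
from collections import Counter


def digram_similarity(a: str, b: str) -> float:
    """Dice coefficient over character digrams (assumed formulation)."""
    da = Counter(a[i:i + 2] for i in range(len(a) - 1))
    db = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(da.values()) + sum(db.values())
    if total == 0:
        # Words shorter than two characters have no digrams.
        return 1.0 if a == b else 0.0
    shared = sum((da & db).values())  # multiset intersection
    return 2 * shared / total


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Edit distance with unit costs (an assumption; the paper's
    edit-cost weights are not stated in the abstract)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ch_a == ch_b else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    query, entry = "makan", "makanan"  # example word pair chosen for illustration
    print(digram_similarity(query, entry))
    print(lcs_length(query, entry))
    print(edit_distance(query, entry))
```

In a dictionary search of the kind the abstract describes, each query word would be scored against every dictionary word and the highest-scoring (or lowest-distance) entries returned as candidates.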

Date: 2000

Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(2000)51:83.0.CO;2-U


Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:51:y:2000:i:8:p:691-706


Handle: RePEc:bla:jamest:v:51:y:2000:i:8:p:691-706