An evaluation of retrieval effectiveness using spelling‐correction and string‐similarity matching methods on Malay texts
Zainab Abu Bakar,
Tengku Mohd T. Sembok and
Mohammed Yusoff
Journal of the American Society for Information Science, 2000, vol. 51, issue 8, 691-706
Abstract:
This article evaluates the effectiveness of spelling‐correction and string‐similarity matching methods in retrieving words in a Malay dictionary that are similar to a set of query words. The spelling‐correction techniques used are SPEEDCOP, Soundex, Davidson, Phonix, and Hartlib. Two dynamic‐programming methods, which measure longest common subsequence and editcost‐distance, are also used. Several search combinations of query and dictionary words are performed in the experiments, the best being one that stems both query and dictionary words using an existing Malay stemming algorithm. The retrieval effectiveness (E) and retrieved and relevant (R&R) mean measures are calculated from a weighted combination of recall and precision values. Results from these experiments are then compared with available results for digram, a string‐similarity method. The best R&R and E results are given by using digram. Editcost‐distances produce the best E results, and both dynamic‐programming methods rank second in finding R&R mean measures.
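For readers unfamiliar with the measures named in the abstract, the following is a minimal Python sketch of three of them: longest common subsequence length, an edit distance, and a digram similarity. It is not the authors' implementation. The function names are hypothetical, the edit distance assumes unit costs for insertion, deletion, and substitution (the paper's editcost scheme may differ), and the digram score assumes the Dice coefficient over adjacent character pairs.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Edit distance with unit costs (an assumption, not the paper's costs)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + substitution))
        prev = curr
    return prev[-1]


def digrams(word: str) -> list:
    """Adjacent character pairs of a word, duplicates kept."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def digram_dice(a: str, b: str) -> float:
    """Dice coefficient over digrams (assumed form of the digram measure)."""
    da, db = digrams(a), digrams(b)
    if not da or not db:
        return 1.0 if a == b else 0.0
    remaining = list(db)
    shared = 0
    for d in da:
        if d in remaining:      # count each digram of b at most once
            remaining.remove(d)
            shared += 1
    return 2 * shared / (len(da) + len(db))
```

As a usage illustration, comparing the Malay words "makan" and "makanan" with these functions gives an LCS length of 5, a unit-cost edit distance of 2, and a digram Dice score of 0.8.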
Date: 2000
Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(2000)51:83.0.CO;2-U
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:51:y:2000:i:8:p:691-706
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology