Using clustering strategies for creating authority files

James C. French, Allison L. Powell and Eric Schulman

Journal of the American Society for Information Science, 2000, vol. 51, issue 8, 774-786

Abstract: As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of information. Authority work, the need to discover and reconcile variant forms of strings in bibliographic entries, will become more critical in the future. Spelling variants, misspellings, and transliteration differences will all increase the difficulty of retrieving information. We investigate a number of approximate string matching techniques that have traditionally been used to help with this problem. We then introduce the notion of approximate word matching and show how it can be used to improve detection and categorization of variant forms. We demonstrate the utility of these approaches using data from the Astrophysics Data System and show how we can reduce the human effort involved in the creation of authority files.
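
As a rough illustration of the ideas the abstract names, the following Python sketch pairs a character-level edit distance (one common approximate string matching technique) with a simple word-by-word comparison, then groups matching strings into candidate clusters for human review. It is only one possible reading of "approximate word matching", not the authors' method: the function names, the `max_edits` threshold, the single-link grouping, and the sample strings are all assumptions made up for this example and are not drawn from the Astrophysics Data System.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def strings_match(s1: str, s2: str, max_edits: int = 1) -> bool:
    """Hypothetical word-level test: same word count, and each aligned
    pair of words is within max_edits character edits (case-insensitive)."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    if len(w1) != len(w2):
        return False
    return all(edit_distance(x, y) <= max_edits for x, y in zip(w1, w2))


def cluster_variants(strings: List[str], max_edits: int = 1) -> List[List[str]]:
    """Single-link grouping with union-find: strings that match directly
    or through a chain of matches end up in the same cluster."""
    parent = list(range(len(strings)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            if strings_match(strings[i], strings[j], max_edits):
                parent[find(j)] = find(i)

    clusters: Dict[int, List[str]] = {}
    for i, s in enumerate(strings):
        clusters.setdefault(find(i), []).append(s)
    return list(clusters.values())


# Invented sample data: two spellings of one affiliation plus an unrelated one.
sample = ["Univ of Virginia", "Univ of Virgina", "Harvard Univ"]
clusters = cluster_variants(sample)
```

Each resulting cluster is a set of candidate variants that a cataloguer could confirm or split, which is where a reduction in human effort would come from. The all-pairs comparison is quadratic in the number of strings, so a real collection would likely need a blocking or indexing step first.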

Date: 2000

Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(2000)51:83.0.CO;2-P

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:51:y:2000:i:8:p:774-786

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571

More articles in Journal of the American Society for Information Science from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.

 
Page updated 2025-03-19
Handle: RePEc:bla:jamest:v:51:y:2000:i:8:p:774-786