Application of the Near Miss Strategy and Edit Distance to Handle Dirty Data
Cihan Varol, Coskun Bayrak, Rick Wagner and Dana Goff
Additional contact information
Cihan Varol: University of Arkansas at Little Rock
Coskun Bayrak: University of Arkansas at Little Rock
Rick Wagner: Acxiom Corporation
Dana Goff: Acxiom Corporation
Chapter 5 in Data Engineering, 2009, pp 91-101 from Springer
Abstract:
In today’s information age, processing customer information in a standardized and accurate manner is known to be a difficult task. Data collection methods vary from source to source by format, volume, and media type. Therefore, it is advantageous to deploy customized data hygiene techniques to standardize the data for meaningfulness and usefulness based on the organization.
Keywords: Edit Distance; Optical Character Recognition; Spelling Error; Cognitive Error; Spelling Correction
Date: 2009
Persistent link: https://EconPapers.repec.org/RePEc:spr:isochp:978-1-4419-0176-7_5
Ordering information: This item can be ordered from
http://www.springer.com/9781441901767
DOI: 10.1007/978-1-4419-0176-7_5
Series: International Series in Operations Research & Management Science, Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.