Semantic Data Matching: Principles and Performance
Russell Deaton, Thao Doan and Tom Schweiger
Additional contact information
Russell Deaton: Computer Science and Computer Engineering, University of Arkansas
Thao Doan: Computer Science and Computer Engineering, University of Arkansas
Tom Schweiger: Acxiom Corporation
Chapter 4 in Data Engineering, 2009, pp 77-90 from Springer
Abstract:
Automated and real-time management of customer relationships requires robust and intelligent data matching across widespread and diverse data sources. Simple string matching algorithms, such as dynamic programming, can handle typographical errors in the data, but are less able to match records that require contextual and experiential knowledge. Latent semantic indexing (LSI) (Berry et al.; Deerwester et al.) is a machine intelligence technique that can match data based upon higher order structure. It is able to handle difficult problems, such as words that share a spelling but differ in meaning, words that are synonymous, and words that have multiple meanings. Essentially, the technique matches records based upon context, by mathematically quantifying when terms occur in the same record.
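The context-based matching described in the abstract can be illustrated with a minimal sketch. It assumes a toy set of four records, a truncated singular value decomposition computed with NumPy, and a rank of k = 2. The records, the rank, and the helper names (term_vector, raw_scores, lsi_scores) are illustrative assumptions and are not taken from the chapter. The query is projected onto the top-k left singular vectors, a simplification of the usual LSI fold-in step.

```python
import numpy as np

# Toy records (hypothetical data, for illustration only).
records = [
    "robert smith oak",
    "bob smith oak",
    "alice jones pine avenue",
    "alice jones pine",
]

# Vocabulary and term-to-row lookup.
vocab = sorted({t for r in records for t in r.split()})
index = {t: i for i, t in enumerate(vocab)}

def term_vector(text):
    """Count how often each vocabulary term occurs in the text."""
    v = np.zeros(len(vocab))
    for t in text.split():
        if t in index:
            v[index[t]] += 1.0
    return v

# Term-by-record matrix: one column per record.
A = np.column_stack([term_vector(r) for r in records])

# Truncated SVD: keep the k largest singular directions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # assumed rank for this toy example
U_k = U[:, :k]

def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def raw_scores(query):
    """Cosine similarity in the original term space."""
    q = term_vector(query)
    return [cosine(q, A[:, j]) for j in range(A.shape[1])]

def lsi_scores(query):
    """Cosine similarity after projecting onto the rank-k latent space."""
    q_k = U_k.T @ term_vector(query)
    docs_k = U_k.T @ A
    return [cosine(q_k, docs_k[:, j]) for j in range(A.shape[1])]

if __name__ == "__main__":
    for name, scores in (("raw", raw_scores("bob")), ("lsi", lsi_scores("bob"))):
        print(name, [round(x, 3) for x in scores])
```

Under these assumptions, the query "bob" shares no term with the first record ("robert smith oak"), so its raw cosine score for that record is 0. In the rank-2 space the two align closely, because both names co-occur with "smith" and "oak", while the unrelated "alice jones" records stay near 0.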
Keywords: Machine Intelligence Techniques; Simple String Matching Algorithm; Latent Semantic Indexing (LSI); Dispersed Data Sources; Higher Order Structure
Date: 2009
Persistent link: https://EconPapers.repec.org/RePEc:spr:isochp:978-1-4419-0176-7_4
Ordering information: This item can be ordered from
http://www.springer.com/9781441901767
DOI: 10.1007/978-1-4419-0176-7_4
Series: International Series in Operations Research & Management Science, Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.