STRGROUP: Stata module to match strings based on their Levenshtein edit distance

Julian Reif

Statistical Software Components from Boston College Department of Economics

Abstract: strgroup groups similar strings together based on their Levenshtein edit distance, which can be useful when merging data that contain typos. For example, an exact merge will not match "widgets" with "widgetts" because the strings are not identical. strgroup provides an objective, automated way to match such strings.
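
To make the idea of edit distance concrete, here is a minimal sketch in Python of the standard Levenshtein distance calculation. It is an illustration only and is not the module's own implementation (strgroup itself runs inside Stata); the function name `levenshtein` below is chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein edit distance between strings a and b.

    Illustrative sketch only; not the code shipped with strgroup.
    """
    # Keep the shorter string on the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


# The example from the abstract: one extra "t" is a single insertion.
print(levenshtein("widgets", "widgetts"))  # prints 1
```

A distance of 1 between two eight-character-scale strings suggests a likely typo, which is the kind of pair an exact merge would miss.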

Language: Stata
Requires: Stata version 9.2
Keywords: data management; string match; string merge; string group; levenshtein
Date: 2010-05-18, Revised 2023-08-22
Note: This module should be installed from within Stata by typing "ssc install strgroup". The module is made available under terms of the GPL v3 (https://www.gnu.org/licenses/gpl-3.0.txt). Windows users should not attempt to download these files with a web browser.

Downloads: (external link)
http://fmwww.bc.edu/repec/bocode/s/strgroup.ado program code (text/plain)
http://fmwww.bc.edu/repec/bocode/s/strgroup.hlp help file (text/plain)
http://fmwww.bc.edu/repec/bocode/l/levenshtein.ado program code (text/plain)
http://fmwww.bc.edu/repec/bocode/l/levenshtein.hlp help file (text/plain)
http://fmwww.bc.edu/repec/bocode/s/strgroup.macosx.plugin plugin library (application/x-binary)
http://fmwww.bc.edu/repec/bocode/s/strgroup.unix.plugin plugin library (application/x-binary)
http://fmwww.bc.edu/repec/bocode/s/strgroup.windows32.plugin plugin library (application/x-binary)
http://fmwww.bc.edu/repec/bocode/s/strgroup.windows64.plugin plugin library (application/x-binary)

Persistent link: https://EconPapers.repec.org/RePEc:boc:bocode:s457151

Ordering information: This software item can be ordered from
http://repec.org/docs/ssc.php

More software in Statistical Software Components from Boston College Department of Economics, Boston College, 140 Commonwealth Avenue, Chestnut Hill MA 02467 USA. Contact information at EDIRC.
Bibliographic data for series maintained by Christopher F Baum.

 
Page updated 2025-03-30
Handle: RePEc:boc:bocode:s457151