
Record linkage using Stata: Preprocessing, linking, and reviewing utilities

Nada Wasi and Aaron Flaaen

Stata Journal, 2015, vol. 15, issue 3, 672-697

Abstract: In this article, we describe Stata utilities that facilitate probabilistic record linkage, the technique typically used for merging two datasets with no common record identifier. While the preprocessing tools are developed specifically for linking two company databases, the other tools can be used for many different types of linkage. Specifically, the stnd_compname and stnd_address commands parse and standardize company names and addresses to improve the match quality when linking. The reclink2 command is a generalized version of Blasnik’s reclink (2010, Statistical Software Components S456876, Department of Economics, Boston College) that allows for many-to-one matching. Finally, clrevmatch is an interactive tool that allows the user to review matched results in an efficient and seamless manner. Rather than exporting results to another file format (for example, Excel), inputting clerical reviews, and importing back into Stata, one can use the clrevmatch tool to conduct all of these steps within Stata. This helps improve the speed and flexibility of matching, which often involves multiple runs. Copyright 2015 by StataCorp LP.
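
To make the workflow in the abstract concrete, the following is a minimal Python sketch of the two ideas it describes: standardizing company names before comparison, and many-to-one matching in which several master records may link to the same record in the other file. It is not the authors' Stata code and does not reproduce the scoring used by reclink2; the suffix list, the function names, the 0.85 threshold, and the sample names are all assumptions made for illustration, and plain string similarity from the standard library stands in for a full probabilistic linkage model.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix list for illustration only; it is not the pattern
# set used by the Stata standardization commands.
SUFFIXES = {"INC", "INCORPORATED", "CORP", "CORPORATION",
            "CO", "COMPANY", "LLC", "LTD", "LIMITED"}


def standardize_name(name):
    """Uppercase, replace punctuation with spaces, strip trailing legal suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    # Keep at least one token so a name that is only a suffix is not emptied.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def link_many_to_one(master, using, threshold=0.85):
    """Map each master name to its best-scoring using name, or None below threshold.

    Several master names may map to the same using name (many-to-one).
    """
    std_using = [(raw, standardize_name(raw)) for raw in using]
    links = {}
    for m in master:
        std_m = standardize_name(m)
        best, best_score = None, 0.0
        for raw_u, std_u in std_using:
            score = SequenceMatcher(None, std_m, std_u).ratio()
            if score > best_score:
                best, best_score = raw_u, score
        matched = best if best_score >= threshold else None
        links[m] = (matched, round(best_score, 3))
    return links


# Made-up sample records (assumption, not data from the article).
master = ["Acme Corp.", "ACME Corporation", "Globex, Inc.", "Initech LLC"]
using = ["ACME", "Globex Incorporated", "Umbrella Co"]
links = link_many_to_one(master, using)
for name, (match, score) in links.items():
    print(name, "->", match, score)
```

In this sketch, unmatched or low-scoring pairs are the ones a clerical review step would inspect, which is the role the abstract assigns to clrevmatch.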

Keywords: reclink2; clrevmatch; reclink; stnd_compname; stnd_address; record linkage; fuzzy matching; string standardization
Date: 2015
Note: to access software from within Stata, net describe http://www.stata-journal.com/software/sj15-3/dm0082/

Downloads: (external link)
http://www.stata-journal.com/article.html?article=dm0082 link to article purchase


Persistent link: https://EconPapers.repec.org/RePEc:tsj:stataj:v:15:y:2015:i:3:p:672-697

Ordering information: This journal article can be ordered from
http://www.stata-journal.com/subscription.html


Stata Journal is currently edited by Nicholas J. Cox and Stephen P. Jenkins

More articles in Stata Journal from StataCorp LLC
Bibliographic data for series maintained by Christopher F. Baum and Lisa Gilmore.

 
Page updated 2025-03-20
Handle: RePEc:tsj:stataj:v:15:y:2015:i:3:p:672-697