Towards Self-Contained Data: Attaching Validation Routines to Variables
William Rising
Additional contact information
William Rising: Bellarmine University
North American Stata Users' Group Meetings 2006 from Stata Users Group
Abstract:
One of Stata's great strengths is its data management. When building or sharing data sets, some of the most time-consuming activities are validating the data and writing documentation for them. Much of this effort could be avoided if data sets were self-contained, i.e., if they could validate themselves. This talk shows how that can be done within Stata. It demonstrates a package of commands for attaching validation rules to the variables themselves, via characteristics, along with commands for running error checks and marking suspicious observations in the data set. The validation system is flexible enough that simple checks continue to work even if variable names change or the data are reshaped, and rich enough that validation may depend on other variables in the data set. Because validation is defined at the variable level, self-validation also works when variables are recombined with data from other data sets. With these tools, Stata's data sets will become truly self-contained.
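The idea of validation rules that travel with the variables can be illustrated with a short sketch. It is written in Python rather than Stata and does not reproduce the commands in the talk's package: `Variable`, `flag_suspicious`, and the example rules and data are hypothetical names chosen for illustration. In Stata the rules would be stored in variable characteristics rather than in an object field.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A rule receives the value being checked plus the whole row, so a check
# may depend on other variables in the data set.
Rule = Callable[[object, dict], bool]


@dataclass
class Variable:
    """A column of values that carries its own validation rules (hypothetical)."""
    values: List[object]
    rules: Dict[str, Rule] = field(default_factory=dict)


def flag_suspicious(data: Dict[str, Variable]) -> Dict[str, List[int]]:
    """Return, for each variable, the row indices that fail any attached rule."""
    n_rows = len(next(iter(data.values())).values)
    flagged: Dict[str, List[int]] = {}
    for name, var in data.items():
        bad = []
        for i in range(n_rows):
            # Build the current row so rules can consult other variables.
            row = {other: v.values[i] for other, v in data.items()}
            if not all(rule(var.values[i], row) for rule in var.rules.values()):
                bad.append(i)
        flagged[name] = bad
    return flagged


# Made-up example data: the rules are attached to each variable, not to the data set.
age = Variable([34, -2, 51], {"nonnegative": lambda x, row: x >= 0})
retired = Variable(
    [1, 0, 1],
    {"plausible": lambda x, row: x == 0 or row["age"] >= 40},
)

data = {"age": age, "retired": retired}
result = flag_suspicious(data)

# A rule that refers only to its own value keeps working after a rename,
# and after the variable is moved into a different data set.
renamed = {"age_years": data["age"]}
result_renamed = flag_suspicious(renamed)
```

In this sketch, `result` marks the negative age and the retiree recorded as younger than 40. The rule on `retired` names `age` explicitly, so only the self-referential rule on `age` survives the rename untouched, which mirrors the abstract's distinction between simple checks and checks that depend on other variables.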
Date: 2006-07-23
Downloads: (external link)
http://repec.org/nasug2006/ckvarTalk.beamer.pdf (application/pdf)
http://repec.org/nasug2006/CheckvarChar_v1.0.0.zip (application/zip)
Persistent link: https://EconPapers.repec.org/RePEc:boc:asug06:10
Bibliographic data for series maintained by Christopher F Baum.