iefieldkit: Stata commands for primary data collection and cleaning
Benjamin Daniels,
Luiza Cardoso de Andrade and
Kristoffer Bjarkefur
Additional contact information
Luiza Cardoso de Andrade: The World Bank Group
Kristoffer Bjarkefur: The World Bank Group
2019 Stata Conference from Stata Users Group
Abstract:
Data collection and cleaning workflows rely on processes that are highly repetitive but extremely important. -iefieldkit- was developed to standardize and simplify best practices for high-quality primary data collection across the 100+ members of the World Bank's Development Research Group, Impact Evaluations team (DIME). It automates error-checking for electronic ODK-based survey modules such as those implemented in SurveyCTO; duplicate checking and resolution; data cleaning, including renaming, labeling, recoding, and survey harmonization; and codebook creation. The presentation will outline how the -iefieldkit- package is intended to provide a workflow skeleton for nearly any type of primary data collection, from questionnaire design to data import. Many -iefieldkit- commands use spreadsheet-based workflows, which reduce repetitive coding in Stata and document corrections and cleaning in a human-readable format. This enables rapid review of data quality in a standardized process, with the goal of producing maximally clean primary data for the downstream data construction and analysis phases in a transparent and accessible manner. These tools are developed as open source on GitHub and are publicly available.
Date: 2019-08-02
Downloads: (external link)
http://fmwww.bc.edu/repec/scon2019/chicago19_Daniels.pdf
Persistent link: https://EconPapers.repec.org/RePEc:boc:scon19:11