Creating factor variables in resultssets and other datasets
Roger Newson
United Kingdom Stata Users' Group Meetings 2013 from Stata Users Group
Abstract:
Factor variables are defined as categorical variables with integer values, which may represent values of some other kind, specified by a value label. We frequently want to generate such variables in Stata datasets, especially resultssets, which are output Stata datasets produced by Stata programs such as the official Stata statsby command and the SSC packages parmest and xcontract. This is because categorical string variables can only be plotted after conversion to numeric variables and because these numeric variables are also frequently used in defining a key of variables, which identify observations in the resultsset uniquely in a sensible sort order. The sencode package is downloadable, and frequently downloaded, from SSC and is a “super” version of encode, which inputs a string variable and outputs a numeric factor variable. Its added features include a replace option allowing the output numeric variable to replace the input string variable, a gsort() option allowing the numeric values to be ordered in ways other than the alphabetical order of the input string values, and a manyto1 option allowing multiple output numeric values to map to the same input string value. The sencode package is well established and has existed since 2001. However, some tips will be given on ways of using it that are not immediately obvious but which the author has found very useful over the years when mass-producing resultssets. These applications use sencode with other commands, such as the official Stata command split and the SSC packages factmerg, factext, and fvregen.
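The idea of encoding a string variable as an integer-valued factor with a value label, in alphabetical order or in some other order, can be illustrated with a minimal sketch. This is a conceptual illustration in Python, not the sencode implementation and not Stata syntax; the function name encode_factor, its sort_key parameter, and the example data are hypothetical.

```python
def encode_factor(strings, sort_key=None):
    """Map string categories to integer codes 1..k and build a value-label dict.

    Hypothetical helper for illustration only. With sort_key=None the distinct
    strings are coded in alphabetical order; a custom sort_key gives another
    order, loosely analogous to choosing a non-alphabetical ordering.
    """
    distinct = sorted(set(strings), key=sort_key)
    code_of = {s: i for i, s in enumerate(distinct, start=1)}
    codes = [code_of[s] for s in strings]          # numeric factor values
    labels = {i: s for s, i in code_of.items()}    # value label: code -> string
    return codes, labels


# Example data (made up for this sketch)
doses = ["medium", "low", "high", "low"]

# Alphabetical coding: high=1, low=2, medium=3
alpha_codes, alpha_labels = encode_factor(doses)

# Coding in a substantively sensible order: low=1, medium=2, high=3
rank = {"low": 0, "medium": 1, "high": 2}
ordered_codes, ordered_labels = encode_factor(doses, sort_key=rank.get)
```

In the second call the integer codes follow the dose order rather than the alphabet, so plots and sorted listings keyed on the numeric variable come out in a sensible sequence.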
Date: 2013-09-16
Downloads: (external link)
http://repec.org/usug2013/newson.uk13.pdf presentation materials (application/pdf)
http://repec.org/usug2013/newson_examples1.do sample file (text/plain)
Persistent link: https://EconPapers.repec.org/RePEc:boc:usug13:01
Bibliographic data for series maintained by Christopher F Baum.