Speaking Stata: Distinct observations
Nicholas Cox and Gary M. Longton
Additional contact information
Gary M. Longton: Fred Hutchinson Cancer Research Center, Seattle
Stata Journal, 2008, vol. 8, issue 4, 557-568
Abstract:
Distinct observations are those different with respect to one or more variables, considered either individually or jointly. Distinctness is thus a key aspect of the similarity or difference of observations. It is sometimes confounded with uniqueness. Counting the number of distinct observations may be required at any point from initial data cleaning or checking to subsequent statistical analysis. We review how far existing commands in official Stata offer solutions to this issue, and we show how to answer questions about distinct observations from first principles by using the by prefix and the egen command. The new distinct command is offered as a convenience tool.
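The first-principles approach mentioned in the abstract can be sketched roughly as follows. This is an illustrative sketch, not code taken from the article: the auto dataset shipped with Stata, the variable rep78, and the new variable names first, tagged, and freq are all assumptions chosen for the example.

```stata
* Illustrative sketch only: dataset and variable names are assumptions
sysuse auto, clear

* by-prefix route: flag the first observation within each group of rep78
* (missing values of rep78 form a group of their own here)
bysort rep78: generate byte first = (_n == 1)
count if first

* egen route: tag() marks exactly one observation per distinct group
* (missing values are ignored unless the missing option is specified)
egen tagged = tag(rep78)
count if tagged

* uniqueness is a different question: values that occur exactly once
bysort rep78: generate long freq = _N
count if freq == 1
```

The two counts of distinct values can differ when the variable has missing values, because the by-prefix route treats missing as a group while tag() skips it by default. The last count concerns uniqueness, which the abstract notes is sometimes confounded with distinctness.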
Keywords: distinct; by; egen; distinctness; uniqueness; data management
Date: 2008
Downloads: (external link)
http://www.stata-journal.com/article.html?article=dm0042
http://www.stata-journal.com/software/sj8-4/dm0042/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:tsj:stataj:v:8:y:2008:i:4:p:557-568
Ordering information: This journal article can be ordered from
http://www.stata-journal.com/subscription.html
Stata Journal is currently edited by Nicholas J. Cox and Stephen P. Jenkins
More articles in Stata Journal from StataCorp LLC
Bibliographic data for series maintained by Christopher F. Baum and Lisa Gilmore.