The Stata Journal
Volume 8 Number 4: pp. 557-568 (dm0042)

Speaking Stata: Distinct observations

Nicholas J. Cox
Department of Geography
Durham University
Durham City, UK

Gary M. Longton
Fred Hutchinson Cancer Research Center
Seattle, WA

Abstract. Distinct observations are those that differ with respect to one or more variables, considered either individually or jointly. Distinctness is thus a key aspect of the similarity or difference of observations. It is sometimes confounded with uniqueness. Counting distinct observations may be required at any point from initial data cleaning or checking to subsequent statistical analysis. We review how far existing commands in official Stata offer solutions to this problem, and we show how to answer questions about distinct observations from first principles by using the by prefix and the egen command. The new distinct command is offered as a convenience tool.

Keywords: distinct, by, egen, distinctness, uniqueness, data management