{smcl}
{* 12nov2003/28oct2004/21apr2005/17aug2005/5nov2006}{...}
{hline}
help for {hi:qplot}{right:(SJ6-4: gr42_4; SJ5-3: gr42_3; SJ4-1: gr42_2;}
{right:STB-61: gr42_1; STB-51: gr42)}
{hline}

{title:Quantile plots}

{p 8 17 2}
{cmd:qplot}
{it:varname}
{ifin}
[{cmd:,}
{cmd:over(}{it:varname}{cmd:)}
{cmd:by(}{it:varname}[{cmd:,} {it:sub_options}]{cmd:)}
{cmdab:miss:ing}
{cmd:a(}{it:#}{cmd:)}
{cmdab:rank:s}
{cmdab:rev:erse}
{cmdab:trsc:ale(}{it:transformation_syntax}{cmd:)}
{cmdab:x:variable(}{it:varname}{cmd:)}
{it:graph_options}
]

{p 8 17 2}
{cmd:qplot}
{it:varlist}
{ifin}
[{cmd:,}
{cmd:by(}{it:varname}[{cmd:,} {it:sub_options}]{cmd:)}
{cmd:a(}{it:#}{cmd:)}
{cmdab:rank:s}
{cmdab:rev:erse}
{cmdab:trsc:ale(}{it:transformation_syntax}{cmd:)}
{cmdab:x:variable(}{it:varname}{cmd:)}
{it:graph_options}
]


{title:Description}

{p 4 4 2}{cmd:qplot} produces a plot of the ordered values of one or more
variables against one of the following: the so-called plotting positions,
which are essentially quantiles of a uniform distribution on [0,1] for the
same number of values; the so-called unique ranks; a specified transformation
of either of those; or a specified variable.

{p 4 4 2}For {it:n} values of a variable {it:x} ordered so that

{p 8 8 2}{it:x}[1] <= {it:x}[2] <= ... <= {it:x}[{it:n}-1] <= {it:x}[{it:n}]

{p 4 4 2}the plotting positions are ({it:i} - {it:a}) / ({it:n} - 2{it:a} + 1)
for {it:i} = 1, ..., {it:n}. The unique ranks run from 1 to {it:n}, with tied
values allocated different ranks so that each integer is assigned to exactly
one value.

{p 4 4 2}When {it:varlist} contains more than one variable, only observations
with nonmissing values of every variable in {it:varlist} are shown.

{p 4 4 2}The plot is a scatterplot by default. It is possible to use
{helpb advanced_options:recast()} to recast the plot as another
{helpb graph_twoway:twoway} type, such as {cmd:connected}, {cmd:dot},
{cmd:dropline}, {cmd:line}, or {cmd:spike}.
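
{p 4 4 2}As a rough sketch of the calculation described above, and not
necessarily how {cmd:qplot} works internally, the default plotting positions
and the unique ranks can be computed by hand. The example assumes the auto
dataset shipped with Stata, in which {cmd:mpg} has no missing values, so that
{cmd:_N} equals {it:n}; the variable names {cmd:pp} and {cmd:urank} are chosen
purely for illustration.

{p 4 8 2}{cmd:. sysuse auto, clear}{p_end}
{p 4 8 2}{cmd:. * assumes no missing values of mpg, so that _N equals n}{p_end}
{p 4 8 2}{cmd:. sort mpg}{p_end}
{p 4 8 2}{cmd:. generate pp = (_n - 0.5) / _N}{p_end}
{p 4 8 2}{cmd:. egen urank = rank(mpg), unique}{p_end}
{p 4 8 2}{cmd:. scatter mpg pp}

{p 4 4 2}With {it:a} = 0.5 the denominator {it:n} - 2{it:a} + 1 reduces to
{it:n}, so that for {it:n} = 4, say, the plotting positions are 0.125, 0.375,
0.625, and 0.875. The final scatterplot should resemble the result of
{cmd:qplot mpg}.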


{title:Options}

{p 4 8 2}
{cmd:by(}{it:varname}[{cmd:,} {it:sub_options}]{cmd:)} specifies that
calculations be carried out separately for each distinct value of a specified
single variable. Results will be shown separately in distinct panels.
See {it:{help by_option}}.

{p 4 8 2}
{cmd:over(}{it:varname}{cmd:)} specifies that calculations be carried out
separately for each distinct value of a specified single variable. Curves will
be shown together in the same panel. {cmd:over()} is allowed only with a
single {it:varname}.

{p 4 8 2}{cmd:missing}, used only with {cmd:over()}, permits the use of
nonmissing values of {it:varname} corresponding to missing values of the
variable named by {cmd:over()}. The default is to ignore such values.

{p 4 8 2}{cmd:a(}{it:#}{cmd:)} specifies {it:a} in the formula for plotting
position. The default is {it:a} = 0.5, giving ({it:i} - 0.5) / {it:n}. Other
choices include {it:a} = 0, giving {it:i} / ({it:n} + 1), and {it:a} = 1/3,
giving ({it:i} - 1/3) / ({it:n} + 1/3).

{p 4 8 2}{cmd:ranks} specifies the use of ranks rather than plotting
positions.

{p 4 8 2}{cmd:reverse} reverses the sort order, so that values decrease from
top left. Ordered values are plotted against 1 - plotting position or
{it:n} - rank + 1.

{p 4 8 2}{cmd:trscale(}{it:transformation_syntax}{cmd:)} specifies the use of
an alternative transformed scale for plotting positions (or ranks) on the
graph. Stata syntax should be used with {cmd:@} as placeholder for
untransformed values. To show percents, specify {cmd:trscale(100 * @)}. To
show probabilities on an inverse normal scale, specify
{cmd:trscale(invnorm(@))}; on a logit scale, specify
{cmd:trscale(logit(@))}; on a folded root scale, specify
{cmd:trscale(sqrt(@) - sqrt(1 - @))}; on a loglog scale, specify
{cmd:trscale(-log(-log(@)))}; on a cloglog scale, specify
{cmd:trscale(cloglog(@))}. Tools to make associated labels and ticks easier
are available on SSC: see {stata ssc desc mylabels:ssc desc mylabels}.

{p 4 8 2}{opt xvariable(varname)} specifies a preexisting plotting position
or rank variable that should be used as the x-axis variable.

{p 4 8 2}{it:graph_options} refers to options of {helpb graph} appropriate to
the {it:plottype} specified.


{title:Examples}

{p 4 8 2}{cmd:. qplot mpg}{p_end}
{p 4 8 2}{cmd:. qplot mpg, over(foreign) clp(l _) recast(line)}{p_end}
{p 4 8 2}{cmd:. qplot length width height, recast(connected)}{p_end}
{p 4 8 2}{cmd:. qplot mpg, reverse rank recast(spike) xla(1 10(10)70 74)}{p_end}
{p 4 8 2}{cmd:. qplot mpg, recast(bar) barw(`=1/74') base(0)}

{p 4 4 2}Ecologists often plot abundance data as Whittaker plots
(Krebs, C. J. 1989. {it:Ecological methodology}. New York: HarperCollins,
p. 344).

{p 4 8 2}{cmd:. egen percent = pc(abundance)}{p_end}
{p 4 8 2}{cmd:. qplot percent, rank reverse ysc(log) yti("Relative abundance, %")}

{p 4 4 2}For more discussion, see Cox, N. J. 2005. The protean quantile plot.
{it:Stata Journal} 5: 442{c -}460.

{p 4 4 2}Hydrologists plot discharges in reverse order as flow duration
curves, often with a logarithmic scale for discharge and a normal probability
scale.

{p 4 8 2}{cmd:. mylabels 1 2 5 10(10)90 95 98 99, myscale(invnorm(@/100)) local(plabels)}{p_end}
{p 4 8 2}{cmd:. qplot discharge, reverse ysc(log) trscale(invnorm(@)) recast(line) xla(`plabels') xti("exceedance probability, %") yti("discharge, m{c -(}c 179{c )-}/s")}


{title:Author}

{p 4 4 2}Nicholas J. Cox, Durham University{break}
n.j.cox@durham.ac.uk


{title:Acknowledgment}

{p 4 4 2}Patrick Royston suggested and first implemented what is here the
{cmd:xvariable()} option.


{title:Also see}

{p 4 13 2}Manual: {hi:[G] graph}, {hi:[R] cumul}, {hi:[R] diagnostic plots}

{p 4 13 2}Online: {helpb graph}, {helpb cumul}, {helpb quantile},
{helpb distplot} (if installed), {helpb mylabels} (if installed)
{p_end}