**Craggit 
*/Begin Log: Re-write to match your own file structure
*/Load data: Re-write to match your own file structure

clear
set mem 8m
use zam_fert_ex.dta

*/Examine data for dependant variables

tab basal_g
sum qbasal_g if basal_g==1

*/Estimate Cragg's double hurdle model

craggit basal_g disttown cland educ age sec(qbasal_g disttown cland educ age)

*/Notice the coefficients and standard errors (SE) are identical to tier by 
*/tier regression.  However, now all the coefficient estimates are in the 
*/system's active memory simultaneously, which will be useful.  First, we can
*/calculate x1g and x2b for each observation as a scaler (new variables).

predict x1g, eq(Tier1)
predict x2b, eq(Tier2)

*/We can also calculate the standard deviation of the second tier latent 
*/variable's error term for each observation.

predict sigma, eq(sigma)

*/Note: craggit is equipped to handle heteroskedastic standard errors where 
*/the SE is a function of observables using the 'het(varlist)' option.  If 
*/that is used, predict will generate a unique SE for each observation

*/Now, all the information we need to calculate the predicted values and 
*/partial effects for every observation is either predicted as a new variable
*/or stored in Stata's active memory.

*/To calculate the probability y=0, from equation (1)

gen Pw0 = 1 - normal(x1g)


*/To calculate the probabilty y>0, from equation (2)

gen Pw1 = normal(x1g)


*/To calculate expected values it is useful to generate a new variable for 
*/the Inverse Mills Ratio scaler:

gen IMR = normden(x2b/sigma)/normal(x2b/sigma)


*/To calculate the expected value of y, given that y > 0, from equation (3):

gen Eyyx2 = x2b + sigma*IMR


*/To calculate the unconditional expected value of y from equation (4):

gen Eyx1x2 = normal(x1g)*(x2b + sigma*IMR)


*/We can also calculate partial effects for a given independant variable. For
*/example, we can see the partial effect of "Hectares of land under 
*/cultivation" on probability of fertilizer use and fertilizer demand.

*/To calculate the partial effect on the probability that y > 0 
*/from equation (5):

gen dPw1_dxj = [Tier1]_b[cland]*normden(x1g)


*/To calculate the partial effect on the expected value of y, given y > 0,
*/from equation (6):

gen dEyyx2_dxj=[Tier2]_b[cland]*(1-IMR*(x2b/sigma+IMR))


*/to calculate the partial effect on the unconditional expected value of y 
*/from equation (7):

gen dEy_dxj=[Tier1]_b[cland]*normd(x1g)*(x2b+sigma*IMR) ///
+[Tier2]_b[cland]*normal(x1g)*(1-IMR*(x2b/sigma+IMR))

*/We can calculate the Average Partial Effect (APE) of 'cland' for the sample
*/on, say, the unconditional expected value using the summarize command. 

sum dEy_dxj


*/We can also see how this APE varies across, say, provinces using 'tabulate'

tab prov, sum(dEy_dxj)

*/Of course, the standard deviations from these summaries cannot be used for 
*/inference.  To evaluate the statistical significance of the APE, we can use
*/a series of commands to "bootstrap" a standard error.  This requires re-
*/calculating every step of the process of obtaining the APE numerous times.

**Inference using bootstrapping

capture program drop APEboot
program define APEboot, rclass
preserve
craggit basal_g disttown cland educ age, sec(qbasal_g disttown cland educ age)
predict bsx1g, eq(Tier1)
predict bsx2b, eq(Tier2)
predict bssigma, eq(sigma)
gen bsIMR = normden(bsx2b/bssigma)/normal(bsx2b/bssigma)
gen bsdEy_dxj=[Tier1]_b[cland]*normd(bsx1g)*(bsx2b+bssigma*bsIMR) ///
+[Tier2]_b[cland]*normal(bsx1g)*(1-bsIMR*(bsx2b/bssigma+bsIMR))
sum bsdEy_dxj
return scalar ape_xj=r(mean)
matrix ape_xj=r(ape_xj)
restore
end
 
 bootstrap ape_xj = r(ape_xj), reps(100): APEboot
 estat bootstrap, all

 */Depending on the model, this can be a very time consuming process, 
 */especially if you need inference on the APE of multiple variables.

 */An alternative would be to approximate the SE useing the Delta Method 

 **Inference using the Delta Method

 quietly craggit basal_g disttown cland educ age, sec(qbasal_g disttown cland
 educ age)

 sum x1g
 local x1gbar = r(mean)
 sum x2b
 local x2bbar = r(mean)

 nlcom [Tier1]_b[cland]*normd(`x1gbar')*(`x2bbar'+[sigma]_b[_cons]*	///
 (normden(`x2bbar'/[sigma]_b[_cons])/normal(`x2bbar'/[sigma]_b[_cons])))
 ///
 //+[Tier2]_b[cland]*normal(`x1gbar')*(1-(normden(`x2bbar'/
 /////
 //[sigma]_b[_cons])/normal(`x2bbar'/[sigma]_b[_cons]))*(`x2bbar'/
 /////
 //[sigma]_b[_cons]+(normden(`x2bbar'/[sigma]_b[_cons])/normal(`x2bbar'/
 /////
 //[sigma]_b[_cons]))))
 //
 //*/Now we can use the calculated APE and the SE from the delta method:
 //
 //di normden(4.6906668/3.012931)
 //
 //*/The p-value is .119
 //
 //