clear
set more off
* Load the Mroz dataset; lfp (labor force participation) is the dependent variable:
webuse set "http://rlhick.people.wm.edu/econ407/data"
webuse mroz
* rescale faminc to units of 10,000 dollars
replace faminc=faminc/10000
* Suppose we think that wife's education (we) is endogenous in a labor force participation
* equation. We could run this:
ivprobit lfp kl6 k618 faminc (we=wmed)
* Search Google and you'll find many people asking for ivlogit: none exists in Stata.
* The reason is that the likelihood function does not have a closed-form solution once the
* standard errors are adjusted for the fact that the instrumented value of we is a random
* variable. Simply using the predicted values as a regressor, without accounting for the
* fact that they are themselves estimates, leads to incorrect (underestimated) standard errors.
* To get around this, bootstrap:
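* For comparison, a minimal sketch of the naive two-step fit (not part of the original
* workflow; naive_resid is a hypothetical variable name). It includes the first-stage
* residual in the logit, as the program below does, but the standard errors it prints
* treat that residual as known data, so they should not be trusted for inference:
capture drop naive_resid
quietly regress we kl6 k618 faminc wmed
predict double naive_resid, residuals
logit lfp kl6 k618 faminc we naive_resid
drop naive_resid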
* First define an eclass program that runs the relevance (first-stage) equation and then
* includes the first-stage residual in the original logit equation to recover the correct
* b's
capture program drop ivlogit
program ivlogit, eclass
version 11
tempname ivbeta
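* first run the relevance (first-stage) equation:
reg we kl6 k618 faminc wmed
* clear any resid left over from an interrupted earlier run, then store the residuals
capture drop resid
predict resid, residuals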
* now include the residual in the logit regression to yield the correct
* betas from the IV regression
logit lfp kl6 k618 faminc we resid
matrix `ivbeta' = e(b)
* need to drop resid so the next replicate can execute the predict command
drop resid
ereturn post `ivbeta'
ereturn local cmd="bootstrap"
end
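* Optional, and an addition to the original script: set a seed so that the bootstrap
* replicates below can be reproduced (the value 12345 is arbitrary):
set seed 12345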
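* Note that once we rescale by the usual factor of about .6 (probit coefficients are roughly
* .6 times logit coefficients), the ivprobit results and our ivlogit results give almost
* exactly the same means and confidence intervals. You might argue that our
* approach is superior to ivprobit because it accounts for *any* general error structure
* and does not rely on normality.
* The default bootstrap output reports normal-based confidence intervals, which forces normality: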
bootstrap _b, reps(100): ivlogit
* This reuses the bootstrap replicates from above and reports the lower and upper .025
* percentiles of the replicate distribution rather than forcing symmetry via the standard
* deviation; it is the preferred method (percentile intervals from the non-parametric bootstrap):
estat bootstrap, percentile
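* A possible follow-up, offered as a sketch rather than part of the original workflow:
* list the normal-based, percentile, and bias-corrected intervals side by side to see
* how much the choice of interval matters with only 100 replicates:
estat bootstrap, all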
* To test for exogeneity of we, we would need to do some further programming work
* (save the estimates and the variance-covariance matrix) and then do a Hausman test against
* a standard logit model that assumes we is exogenous (from the regression below).