clear
set more off

* Load the mroz dataset; lfp (labor force participation) is the dependent variable:
webuse set "http://rlhick.people.wm.edu/econ407/data"
webuse mroz

* rescale faminc to units of 10,000 dollars
replace faminc = faminc/10000

* Suppose we think that wife's education (we) is endogenous in a labor force
* participation equation. We could run this, using wmed as the instrument:
ivprobit lfp kl6 k618 faminc (we=wmed)

* Search google and you'll find many people asking for ivlogit: none exists in stata.
* The reason is that the likelihood function does not have a closed form solution,
* and a two-step estimator has to adjust the standard errors to account for the fact
* that the instrumented value of we is a random variable. Simply using the predicted
* values as a regressor without accounting for this will lead to bad
* (underestimated) standard errors.

* To get around this, bootstrap:
* first define an eclass program that runs the relevancy (first-stage) equation and
* then includes its residual in the original logit equation to recover the correct
* b's
capture program drop ivlogit
program ivlogit, eclass
    version 11
    tempname ivbeta

    * first run the relevancy equation:
    reg we kl6 k618 faminc wmed
    predict resid, residuals

    * now include the residual in the logit regression to yield the correct
    * beta's from the iv regression
    logit lfp kl6 k618 faminc we resid
    matrix `ivbeta' = e(b)

    * need to drop resid so the next replicate can execute the predict command
    drop resid

    ereturn post `ivbeta'
    ereturn local cmd="bootstrap"
end

* Note that once we account for the usual scale factor of about .6 between probit
* and logit coefficients, we get almost exactly the same means and confidence
* intervals from ivprobit and from our ivlogit command. You might argue that our
* approach is superior to ivprobit because it accounts for *any* general error
* structure and does not rely on normality.

* This reports confidence intervals that force normality:
bootstrap _b, reps(100): ivlogit

* This recycles the previous results and calculates the upper and lower .025 percentiles
* rather than forcing symmetry via the std deviation and is the preferred method
* (the percentile method of the non-parametric bootstrap):
estat bootstrap, percentile

* To test for exogeneity of we, we would need to do some further programming work
* (save the estimates and variance covariance) and then do a hausman test compared
* to a standard logit model that assumes we is exogenous (from the regression below).
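
* A minimal sketch of that further programming work, not a definitive implementation.
* Assumptions: the bootstrap results for ivlogit are still in e() (estat bootstrap
* leaves them in place), the coefficient order is kl6 k618 faminc we resid _cons,
* and the names b_iv, V_iv, b_lg, V_lg, d, Vd, H and hstat are purely illustrative.
matrix b_iv = e(b)
matrix V_iv = e(V)

* keep only the coefficients shared with the plain logit
* (columns 1-4 and 6; column 5 is the first-stage residual)
matrix b_iv = b_iv[1, 1..4], b_iv[1, 6]
matrix V_iv = (V_iv[1..4, 1..4], V_iv[1..4, 6]) \ (V_iv[6, 1..4], V_iv[6, 6])

* standard logit that treats we as exogenous
logit lfp kl6 k618 faminc we
matrix b_lg = e(b)
matrix V_lg = e(V)

* hausman statistic: d * inv(V_iv - V_lg) * d'
matrix d  = b_iv - b_lg
matrix Vd = V_iv - V_lg
matrix H  = d * invsym(Vd) * d'
scalar hstat = H[1,1]
display "Hausman chi2(" colsof(d) ") = " hstat "   p-value = " chi2tail(colsof(d), hstat)

* Caveat: with only 100 replicates Vd need not be positive definite; if it is not,
* the statistic above is unreliable and more replicates may be needed.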