Application of the Tobit and Heckman Sample Selection Model
This page has been moved to https://econ.pages.code.wm.edu/407/notes/docs/index.html and is no longer being maintained here.
Tobit Application
The following code shows some of the mechanics for running a Tobit model as well as ways we can use the model results after estimation. This uses a "toy dataset" that has only one independent variable.
clear
webuse set "https://rlhick.people.wm.edu/econ407/data"
webuse toy_tobit
sum
(prefix now "https://rlhick.people.wm.edu/econ407/data")
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
index | 5,000 2499.5 1443.52 0 4999
y | 5,000 5.082118 .9811196 3.858572 8.662185
x | 5,000 -.0008725 1.001673 -3.620101 3.748413
The first 5 rows of the data look like this:
list in 1/5
     +--------------------------------+
     | index           y           x  |
     |--------------------------------|
  1. |     0   4.2630254  -1.2008912  |
  2. |     1   4.6382766    .1366034  |
  3. |     2    4.776782   1.1964091  |
  4. |     3   6.8389654   .71531347  |
  5. |     4   5.2471644   .39031073  |
     +--------------------------------+
Of particular interest is our censored dependent variable \(\mathbf{y}\). The histogram is
hist y, frequency bin(20) graphregion(color(white)) ///
title("Histogram of the Censored Dependent Variable") xtitle("y")
graph export "/tmp/toy_tobit_hist.eps", replace
The scatterplot also shows how the censored values have been "stacked" at the lower censoring point and how lower values of \(\mathbf{y}\) are dragged toward the censoring point.
scatter x y, graphregion(color(white)) title("Scatterplot of x and y") ///
    msize(tiny) xtitle("y") ytitle("x") xlab(0(.5)9)
graph export "/tmp/toy_tobit_scatter.eps", replace
(Figure: scatterplot of x and y, ../sitepics/toytobitscatter.png)
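The stacking is easy to reproduce: generate a latent \(y^* = \beta_0 + \beta_1 x + \epsilon\) and censor it from below at \(a\). A minimal sketch of such a data-generating process (in Python for illustration; the parameter values here are made up, not the ones behind toy_tobit):

```python
import random

random.seed(407)

# Illustrative parameters (NOT the ones used to build toy_tobit)
b0, b1, sigma, a = 5.0, 0.5, 1.0, 4.0

x = [random.gauss(0, 1) for _ in range(5000)]
ystar = [b0 + b1 * xi + random.gauss(0, sigma) for xi in x]  # latent outcome
y = [max(yi, a) for yi in ystar]                             # censored from below at a

censored = sum(yi <= a for yi in ystar)
print(f"{censored} of {len(y)} observations stacked at a = {a}")
```

Every latent value below \(a\) is replaced by \(a\) itself, which is exactly the spike visible in the histogram and the vertical "wall" in the scatterplot.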
To run the tobit model, we issue the command
egen a = min(y)
display "Lower Censoring Point (a): ", a
tobit y x, ll
Lower Censoring Point (a): 3.8585718
Refining starting values:
Grid node 0: log likelihood = -6924.13
Fitting full model:
Iteration 0: log likelihood = -6924.13
Iteration 1: log likelihood = -6813.1161
Iteration 2: log likelihood = -6810.4363
Iteration 3: log likelihood = -6810.4337
Iteration 4: log likelihood = -6810.4337
Tobit regression Number of obs = 5,000
Uncensored = 4,198
Limits: lower = 3.86 Left-censored = 802
upper = +inf Right-censored = 0
LR chi2(1) = 1074.36
Prob > chi2 = 0.0000
Log likelihood = -6810.4337 Pseudo R2 = 0.0731
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | .5062554 .01488 34.02 0.000 .4770841 .5354267
_cons | 4.986882 .0147063 339.10 0.000 4.958051 5.015713
-------------+----------------------------------------------------------------
var(e.y)| 1.02864 .0232449 .9840649 1.075235
------------------------------------------------------------------------------
Outputs from the Tobit Model
Once we have run the model, we can use the results for creating parameter values that we might want to use for verifying our understanding of the Tobit Model.
| Parameter | Stata Command |
|---|---|
| \(\sigma\) | sqrt(_b[/var(e.y)]) |
| \(log(L)\) | e(ll) |
| \(\beta_{name}\) | _b[name] |
| \(\mathbf{x}\hat{\beta}\) | predict xb, xb |
Example code using these stored results, each line creating a new variable (or prediction) in your data space:
gen sigma = sqrt(_b[/var(e.y)])
gen logLike = e(ll)
gen beta_x = _b[x]
predict xb, xb
These can be used for replicating a number of the outputs available as postestimation commands. The following table lists the various expected values from the Tobit model, the formula each is based on, and the Stata commands for generating marginal effects and predicted values, respectively.
| Expected Value | Formula | Stata Commands |
|---|---|---|
| \(E[y \shortmid y\hspace{.03in} obs]\) | \(\mathbf{x}_i \beta^{T} + \sigma \frac{\phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)}{1-\Phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma}\right)}\) | margins, predict(e(a,.)) <br> predict ycond, e(a,.) |
| \(Prob(Not Censored)\) | \(1-\Phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)\) | margins, pr(a,.) <br> predict probobs, pr(a,.) |
| \(E[y]\) | \(\Phi \left (\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right ) a + \left (1-\Phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma}\right ) \right) \left [\mathbf{x}_i \beta^{T} + \sigma \frac{\phi \left( \frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)}{1-\Phi \left (\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)} \right ]\) | margins, predict(ystar(a,.)) <br> predict yexp, ystar(a,.) |
| \(E[y^*]\) | \(\mathbf{x}_i \beta^{T}\) | margins, xb <br> predict xb, xb |
Note that \(\beta^{T}\) denotes the estimates of \(\beta\) from the Tobit model.
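These formulas are easy to check outside Stata. The sketch below (Python, standard library only; the coefficient values are copied from the Tobit output above) computes each expected value directly from the estimates, so you can verify, for instance, that \(a \le E[y] \le E[y \shortmid y\,obs]\) always holds:

```python
import math

def npdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def ncdf(z):
    """Standard normal CDF Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Estimates copied from the Tobit output above
b_x, b_cons = 0.5062554, 4.986882
sigma = math.sqrt(1.02864)      # sqrt of var(e.y)
a = 3.8585718                   # lower censoring point

def tobit_predictions(x):
    xb = b_cons + b_x * x                       # E[y*], the latent index
    z = (a - xb) / sigma
    p_obs = 1.0 - ncdf(z)                       # Prob(not censored)
    e_cond = xb + sigma * npdf(z) / p_obs       # E[y | y observed]
    e_y = ncdf(z) * a + p_obs * e_cond          # E[y]
    return {"xb": xb, "p_obs": p_obs, "e_cond": e_cond, "e_y": e_y}

# First toy observation has x = -1.2008912
preds = tobit_predictions(-1.2008912)
```

These are the same quantities Stata produces via predict and margins; the Python version just makes the formulas concrete.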
Heckman Application
The following code shows how to run a Heckman model using data that has completely non-overlapping variables: \(\mathbf{x}\), the independent variables in the amounts equation, and \(\mathbf{w}\), the independent variables in the selection equation.
clear
webuse set "https://rlhick.people.wm.edu/econ407/data"
webuse toy_heckman
sum
(prefix now "https://rlhick.people.wm.edu/econ407/data")
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
index | 5,000 2499.5 1443.52 0 4999
y | 3,166 -1.504517 .9267993 -4.643568 1.950691
x | 5,000 .0038611 .9948256 -3.184449 4.071575
z | 5,000 .6332 .4819795 0 1
w | 5,000 .0051353 .9901703 -3.564716 3.639859
Note that in this dataset there are missing values for \(\mathbf{y}\) as a result of our selection mechanism. The first five observations look like this:
list in 1/5
     +--------------------------------------------------+
     | index            y            x   z            w |
     |--------------------------------------------------|
  1. |     0            .   -2.9543519   0   -2.1461785 |
  2. |     1   -1.4679083    .92509399   1    1.5765399 |
  3. |     2            .    .98621375   0    .05838758 |
  4. |     3   -1.8399264    1.1407735   1    .70403742 |
  5. |     4   -1.4611701    .42070096   1    .23800567 |
     +--------------------------------------------------+
Using this data, we estimate a very simple Heckman model with only one variable each in the amounts and selection equations.
heckman y x, select(z = w)
Iteration 0: log likelihood = -6421.6346
Iteration 1: log likelihood = -6419.2129
Iteration 2: log likelihood = -6419.1934
Iteration 3: log likelihood = -6419.1934
Heckman selection model Number of obs = 5,000
(regression model with sample selection) Selected = 3,166
Nonselected = 1,834
Wald chi2(1) = 611.64
Log likelihood = -6419.193 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
y |
x | .4847046 .0195987 24.73 0.000 .4462918 .5231174
_cons | -2.009821 .025109 -80.04 0.000 -2.059034 -1.960608
-------------+----------------------------------------------------------------
z |
w | .9816843 .0262184 37.44 0.000 .9302972 1.033071
_cons | .4758687 .0213349 22.30 0.000 .4340531 .5176844
-------------+----------------------------------------------------------------
/athrho | 1.106114 .0630869 17.53 0.000 .9824655 1.229762
/lnsigma | .0147216 .0168441 0.87 0.382 -.0182922 .0477353
-------------+----------------------------------------------------------------
rho | .8026843 .0224399 .7541312 .8425102
sigma | 1.01483 .0170939 .9818741 1.048893
lambda | .8145885 .0334737 .7489813 .8801956
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0): chi2(1) = 214.58 Prob > chi2 = 0.0000
As discussed in class, the command reports the likelihood ratio test statistic (at the bottom) for testing whether \(\rho=0\) (the OLS model could be applied) or not (the Heckman model should be applied).
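Stata estimates the unbounded transforms /athrho and /lnsigma and then back-transforms them into the rho, sigma, and lambda rows at the bottom of the table. A quick check (Python, standard library only; the numbers are copied from the output above):

```python
import math

# Estimated transformed parameters from the Heckman output above
athrho, lnsigma = 1.106114, 0.0147216

rho = math.tanh(athrho)     # inverse of the arc-hyperbolic-tangent transform
sigma = math.exp(lnsigma)   # inverse of the log transform
lam = rho * sigma           # lambda = rho * sigma

print(rho, sigma, lam)      # rho ~ 0.8027, sigma ~ 1.0148, lambda ~ 0.8146
```

These reproduce the reported rho = .8026843, sigma = 1.01483, and lambda = .8145885 to rounding precision.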
Outputs from the Heckman Model
Once we have run the model, we can use the results for creating parameter values that we might want to use for verifying our understanding of the Heckman Model.
| Parameter | Stata Command |
|---|---|
| \(\sigma\) | e(sigma) |
| \(\rho\) | e(rho) |
| \(log(L)\) | e(ll) |
| \(\beta_{name}\) | _b[name] |
| \(\gamma_{name}\) | _b[z:name] (where z is the name of your selection dependent variable) |
| \(\mathbf{x}\hat{\beta}\) | predict xb, xb |
| \(\mathbf{w}\hat{\gamma}\) | predict zg, xbsel |
One thing worth noting is that we have two sets of parameters being maintained in the background by Stata. To extract individual ones, given our variable names, use syntax like this:
gen beta_x = _b[x]
di "beta x coefficient: "beta_x
gen gamma_w = _b[z:w]
di "gamma w coefficient:", gamma_w
beta x coefficient: .48470458 gamma w coefficient: .98168427
Usually for this class, you will be working with the linear predictor, so you'll be calculating \(\mathbf{w}\gamma\) and \(\mathbf{x}\beta\) using
predict zg, xbsel
predict xb, xb
| Expected Value | Formula | Stata Commands |
|---|---|---|
| \(E[y \shortmid y\hspace{.03in} obs]\) | \(\mathbf{x}_i \beta^{H} + \rho \sigma_{\epsilon} \frac{\phi(\mathbf{w}_i \gamma)}{\Phi(\mathbf{w}_i \gamma)}\) | margins, predict(ycond) <br> predict ycond, ycond |
| \(Prob(Not Censored)\) | \(\Phi(\mathbf{w}_i \gamma)\) | margins, pr(psel) <br> predict probobs, pr(psel) |
| \(E[y]\) | \(\Phi(\mathbf{w}_i \gamma)\left[\mathbf{x}_i \beta^H + \rho \sigma_{\epsilon} \frac{\phi(\mathbf{w}_i \gamma)}{\Phi(\mathbf{w}_i \gamma)}\right]\) | margins, predict(yexpected) <br> predict yexp, yexpected |
| \(E[y^*]\) | \(\mathbf{x}_i \beta^{H}\) | margins, xb <br> predict xb, xb |
Note that \(\beta^{H}\) denotes the estimates of \(\beta\) from the Heckman model.
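As with the Tobit case, these formulas can be verified by hand. The sketch below (Python, standard library only; coefficient values copied from the Heckman output above) computes \(E[y \shortmid y\,obs]\) via the inverse Mills ratio for the second toy observation (x = .92509399, w = 1.5765399), which Stata's ycond prediction reports as -1.518555:

```python
import math

def npdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def ncdf(z):
    """Standard normal CDF Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Estimates copied from the Heckman output above
b_x, b_cons = 0.4847046, -2.009821   # amounts equation
g_w, g_cons = 0.9816843, 0.4758687   # selection equation
rho, sigma = 0.8026843, 1.01483

def heckman_predictions(x, w):
    xb = b_cons + b_x * x
    wg = g_cons + g_w * w
    mills = npdf(wg) / ncdf(wg)          # inverse Mills ratio
    ycond = xb + rho * sigma * mills     # E[y | y observed]
    p_obs = ncdf(wg)                     # Prob(not censored)
    yexp = p_obs * ycond                 # E[y]
    return {"xb": xb, "ycond": ycond, "p_obs": p_obs, "yexp": yexp}

r = heckman_predictions(0.92509399, 1.5765399)
print(r["ycond"])  # ~ -1.5186, matching Stata's ycond for this observation
```

Notice that rho * sigma is exactly the lambda reported in the Heckman output, which is why lambda multiplies the inverse Mills ratio in the conditional expectation.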
For example, if we want to predict \(E[y \shortmid y\hspace{.03in} obs]\) for each observation we can just issue this command:
predict ycond, ycond
list y ycond in 1/10
+------------------------+
| y ycond |
|------------------------|
1. | . -1.771267 |
2. | -1.4679083 -1.518555 |
3. | . -1.130811 |
4. | -1.8399264 -1.269633 |
5. | -1.4611701 -1.473895 |
|------------------------|
6. | -1.4463141 -1.561146 |
7. | -2.0634319 -1.475339 |
8. | -.82478092 -2.157227 |
9. | -.75221966 -1.741791 |
10. | . -1.650602 |
+------------------------+
This shows that the model produces a prediction even for observations where \(y\) is missing (not selected); that is, we can predict outcomes as if everyone were selected.