Application of the Tobit and Heckman Sample Selection Model

This page has been moved to https://econ.pages.code.wm.edu/407/notes/docs/index.html and is no longer being maintained here.

Tobit Application

The following code shows some of the mechanics of running a Tobit model, as well as ways we can use the model results after estimation. It uses a "toy dataset" that has a single independent variable.

clear
webuse set "https://rlhick.people.wm.edu/econ407/data"
webuse toy_tobit
sum
(prefix now "https://rlhick.people.wm.edu/econ407/data")

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       index |      5,000      2499.5     1443.52          0       4999
           y |      5,000    5.082118    .9811196   3.858572   8.662185
           x |      5,000   -.0008725    1.001673  -3.620101   3.748413

The first five rows of data look like this:

list in 1/5

   +--------------------------------+
   | index           y            x |
   |--------------------------------|
1. |     0   4.2630254   -1.2008912 |
2. |     1   4.6382766     .1366034 |
3. |     2    4.776782    1.1964091 |
4. |     3   6.8389654    .71531347 |
5. |     4   5.2471644    .39031073 |
   +--------------------------------+

Of particular interest is our censored dependent variable \(\mathbf{y}\). The histogram is

hist y, frequency bin(20) graphregion(color(white)) ///
        title("Histogram of the Censored Dependent Variable") xtitle("y")
graph export "/tmp/toy_tobit_hist.eps", replace

The scatterplot also shows how the censored values have been "stacked" at the lower censoring point: every value of \(\mathbf{y}\) that would have fallen below it is pushed up to the censoring point.

scatter x y, graphregion(color(white)) title("Scatterplot of x and y") ///
             msize(tiny) xtitle("x") ytitle("y") xlab(0(.5)9)
graph export "/tmp/toy_tobit_scatter.eps", replace


To run the Tobit model, we issue the commands below. Specified with no argument, the ll option tells Stata to use the minimum of \(\mathbf{y}\) as the lower censoring point; the egen line simply computes that value so we can display it.

egen a = min(y)
display "Lower Censoring Point (a): ", a
tobit y x, ll
Lower Censoring Point (a):  3.8585718

Refining starting values:

Grid node 0:   log likelihood =   -6924.13

Fitting full model:

Iteration 0:   log likelihood =   -6924.13
Iteration 1:   log likelihood = -6813.1161
Iteration 2:   log likelihood = -6810.4363
Iteration 3:   log likelihood = -6810.4337
Iteration 4:   log likelihood = -6810.4337

Tobit regression                                Number of obs     =      5,000
                                                   Uncensored     =      4,198
Limits: lower = 3.86                               Left-censored  =        802
        upper = +inf                               Right-censored =          0

                                                LR chi2(1)        =    1074.36
                                                Prob > chi2       =     0.0000
Log likelihood = -6810.4337                     Pseudo R2         =     0.0731

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5062554     .01488    34.02   0.000     .4770841    .5354267
       _cons |   4.986882   .0147063   339.10   0.000     4.958051    5.015713
-------------+----------------------------------------------------------------
     var(e.y)|    1.02864   .0232449                      .9840649    1.075235
------------------------------------------------------------------------------
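For intuition about what tobit is maximizing, the log likelihood can be written down directly: censored observations contribute the probability of the latent variable falling at or below the censoring point, and uncensored observations contribute a normal density. A minimal Python sketch of this formula (an illustration only, not Stata's implementation; the normal pdf and cdf are built from the standard library):

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def tobit_loglike(y, x, b0, b1, sigma, a):
    """Log likelihood for a left-censored (at a) Tobit model with one regressor.

    Censored observations (y <= a) contribute log Prob(y* <= a);
    uncensored observations contribute the log normal density of the residual.
    """
    ll = 0.0
    for yi, xi in zip(y, x):
        xb = b0 + b1 * xi
        if yi <= a:  # left-censored observation
            ll += math.log(norm_cdf((a - xb) / sigma))
        else:        # uncensored observation
            ll += math.log(norm_pdf((yi - xb) / sigma) / sigma)
    return ll
```

Evaluated over the full dataset at the estimates above (slope ≈ .506, constant ≈ 4.987, \(\sigma = \sqrt{1.02864} \approx 1.014\), \(a \approx 3.859\)), this should reproduce the reported log likelihood of -6810.4337 up to rounding.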

Outputs from the Tobit Model

Once we have run the model, we can use the results to create parameter values for verifying our understanding of the Tobit Model.

Parameter Stata Command
\(\sigma\) sqrt(_b[/var(e.y)])
\(log(L)\) e(ll)
\(\beta_{name}\) _b[name]
\(\mathbf{x}\hat{\beta}\) predict xb, xb

The following example code uses these stored results to create new variables in your data space:

gen sigma = sqrt(_b[/var(e.y)])
gen logLike = e(ll)
gen beta_x = _b[x]
predict xb, xb

These can be used to replicate a number of the outputs available as postestimation commands. The following table lists the expected values from the Tobit model, the formula each is based on, and the Stata commands for generating marginal effects and predicted values, respectively.

Expected Value Formula Stata Command
\(E[y \shortmid y\hspace{.03in} obs]\) \(\mathbf{x}_i \beta^{T} + \sigma \frac{\phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)}{1-\Phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma}\right)}\) margins, predict(e(a,.))
    predict ycond, e(a,.)
\(Prob(Not Censored)\) \(1-\Phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)\) margins, predict(pr(a,.))
    predict probobs, pr(a,.)
\(E[y]\) \(\Phi \left (\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right ) a + \left (1-\Phi \left(\frac{a-\mathbf{x}_i \beta^{T}}{\sigma}\right ) \right) \left [\mathbf{x}_i \beta^{T} + \sigma \frac{\phi \left( \frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)}{1-\Phi \left (\frac{a-\mathbf{x}_i \beta^{T}}{\sigma} \right)} \right ]\) margins, predict(ystar(a,.))
    predict yhat, ystar(a,.)
\(E[y^*]\) \(\mathbf{x}_i \beta^{T}\) margins, xb
    predict xb, xb

Note that \(\beta^T\) denotes the estimates of \(\beta\) from the Tobit Model.
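These formulas are easy to verify by hand once you have \(\mathbf{x}_i \beta^T\), \(\sigma\), and \(a\). A minimal Python sketch computing all four quantities for one observation (an illustration of the formulas above; the normal pdf and cdf are built from the standard library):

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def tobit_predictions(xb, sigma, a):
    """Tobit predicted values for one observation, given the linear
    index xb = x*beta, sigma, and the lower censoring point a."""
    z = (a - xb) / sigma
    prob_obs = 1 - norm_cdf(z)                   # Prob(not censored)
    ycond = xb + sigma * norm_pdf(z) / prob_obs  # E[y | y observed]
    ey = norm_cdf(z) * a + prob_obs * ycond      # E[y]
    return {"ystar": xb, "prob_obs": prob_obs, "ycond": ycond, "ey": ey}
```

For an observation with x = 0, xb is just the constant (about 4.987), so with \(\sigma \approx 1.014\) and \(a \approx 3.859\) these numbers should match the corresponding predict results for that observation.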

Heckman Application

The following code shows how to run a Heckman model using data in which \(\mathbf{x}\), the independent variables in the amounts equation, and \(\mathbf{w}\), the independent variables in the selection equation, do not overlap at all.

clear
webuse set "https://rlhick.people.wm.edu/econ407/data"
webuse toy_heckman
sum
(prefix now "https://rlhick.people.wm.edu/econ407/data")

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       index |      5,000      2499.5     1443.52          0       4999
           y |      3,166   -1.504517    .9267993  -4.643568   1.950691
           x |      5,000    .0038611    .9948256  -3.184449   4.071575
           z |      5,000       .6332    .4819795          0          1
           w |      5,000    .0051353    .9901703  -3.564716   3.639859

Note that in this dataset there are missing values for \(\mathbf{y}\) as a result of our selection mechanism. The first five observations look like this:

list in 1/5

   +--------------------------------------------------+
   | index            y            x   z            w |
   |--------------------------------------------------|
1. |     0            .   -2.9543519   0   -2.1461785 |
2. |     1   -1.4679083    .92509399   1    1.5765399 |
3. |     2            .    .98621375   0    .05838758 |
4. |     3   -1.8399264    1.1407735   1    .70403742 |
5. |     4   -1.4611701    .42070096   1    .23800567 |
   +--------------------------------------------------+

Using this data, we estimate a very simple Heckman Model having only one variable each in the amounts and selection equations.

heckman y x, select(z = w)

Iteration 0:   log likelihood = -6421.6346
Iteration 1:   log likelihood = -6419.2129
Iteration 2:   log likelihood = -6419.1934
Iteration 3:   log likelihood = -6419.1934

Heckman selection model                         Number of obs     =      5,000
(regression model with sample selection)              Selected    =      3,166
                                                      Nonselected =      1,834

                                                Wald chi2(1)      =     611.64
Log likelihood = -6419.193                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
           x |   .4847046   .0195987    24.73   0.000     .4462918    .5231174
       _cons |  -2.009821    .025109   -80.04   0.000    -2.059034   -1.960608
-------------+----------------------------------------------------------------
z            |
           w |   .9816843   .0262184    37.44   0.000     .9302972    1.033071
       _cons |   .4758687   .0213349    22.30   0.000     .4340531    .5176844
-------------+----------------------------------------------------------------
     /athrho |   1.106114   .0630869    17.53   0.000     .9824655    1.229762
    /lnsigma |   .0147216   .0168441     0.87   0.382    -.0182922    .0477353
-------------+----------------------------------------------------------------
         rho |   .8026843   .0224399                      .7541312    .8425102
       sigma |    1.01483   .0170939                      .9818741    1.048893
      lambda |   .8145885   .0334737                      .7489813    .8801956
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):   chi2(1) =   214.58   Prob > chi2 = 0.0000
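As with the Tobit model, it helps to see the log likelihood that heckman is maximizing: non-selected observations contribute the probability of non-selection, while selected observations contribute the density of \(y\) times the selection probability conditional on the amounts residual. A minimal Python sketch for the one-variable case (an illustration of the textbook formula, not Stata's implementation):

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def heckman_loglike(y, x, z, w, b0, b1, g0, g1, rho, sigma):
    """Log likelihood for the Heckman MLE with one variable in each equation.

    (b0, b1) are the amounts-equation parameters; (g0, g1) the selection
    parameters.  For non-selected observations (z == 0) y is unobserved.
    """
    ll = 0.0
    for yi, xi, zi, wi in zip(y, x, z, w):
        zg = g0 + g1 * wi
        if zi == 0:  # not selected: only the selection probability matters
            ll += math.log(norm_cdf(-zg))
        else:        # selected: density of y times conditional selection prob
            resid = (yi - b0 - b1 * xi) / sigma
            cond = (zg + rho * resid) / math.sqrt(1 - rho ** 2)
            ll += math.log(norm_pdf(resid) / sigma) + math.log(norm_cdf(cond))
    return ll
```

Notice that when \(\rho=0\) the selected-observation term separates into an independent probit piece and a normal regression piece, which is why OLS on the selected sample is appropriate in that case.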

As discussed in class, the command reports a likelihood ratio test statistic at the bottom of the output for testing whether \(\rho=0\) (in which case the OLS Model could be applied) or not (in which case the Heckman Model should be applied).
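The statistic itself is just twice the difference between the unrestricted (Heckman) log likelihood and the restricted one (a probit for selection plus OLS on the selected sample, estimated separately). A small Python sketch, assuming you have the two log-likelihood values in hand (the chi-squared(1) p-value is computed from the standard normal CDF):

```python
import math

def lr_test_rho(ll_heckman, ll_restricted):
    """LR statistic and p-value for H0: rho = 0 (one restriction).

    Under H0 the statistic is asymptotically chi2(1); for one degree of
    freedom the survival function is P(chi2 > s) = 2 * (1 - Phi(sqrt(s))).
    """
    stat = 2 * (ll_heckman - ll_restricted)
    phi = 0.5 * (1 + math.erf(math.sqrt(stat) / math.sqrt(2)))  # Phi(sqrt(stat))
    return stat, 2 * (1 - phi)
```

Working backwards from the output above, the restricted log likelihood implied by chi2(1) = 214.58 is -6419.1934 - 214.58/2 = -6526.4834, and the resulting p-value is effectively zero, matching Prob > chi2 = 0.0000.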

Outputs from the Heckman Model

Once we have run the model, we can use the results for creating parameter values that we might want to use for verifying our understanding of the Heckman Model.

Parameter Stata Command
\(\sigma\) e(sigma)
\(\rho\) e(rho)
\(log(L)\) e(ll)
\(\beta_{name}\) _b[name]
\(\gamma_{name}\) _b[z:name] (where z is the name of your selection dependent variable)
\(\mathbf{x}\hat{\beta}\) predict xb, xb
\(\mathbf{w}\hat{\gamma}\) predict zg, xbsel

One thing worth noting is that two sets of parameters are being maintained in the background by Stata. To extract individual ones, given our variable names, use syntax like this:

gen beta_x = _b[x]
di "beta x coefficient: "beta_x
gen gamma_w = _b[z:w]
di "gamma w coefficient:", gamma_w
beta x coefficient: .48470458
gamma w coefficient: .98168427

Usually for this class, you will be working with the linear predictor, so you'll be calculating \(\mathbf{w}\gamma\) and \(\mathbf{x}\beta\) using

predict zg, xbsel
predict xb, xb

Expected Value Formula Stata Command
\(E[y \shortmid y\hspace{.03in} obs]\) \(\mathbf{x}_i \beta^{H} + \rho \sigma_{\epsilon} \frac{\phi(\mathbf{w}_i \gamma)}{\Phi(\mathbf{w}_i \gamma)}\) margins, predict(ycond)
    predict ycond, ycond
\(Prob(Not Censored)\) \(\Phi(\mathbf{w}_i \gamma)\) margins, predict(psel)
    predict probobs, pr(psel)
\(E[y]\) \(\Phi(\mathbf{w}_i \gamma)\left[\mathbf{x}_i \beta^H + \rho \sigma_{\epsilon} \frac{\phi(\mathbf{w}_i \gamma)}{\Phi(\mathbf{w}_i \gamma)}\right]\) margins, predict(yexpected)
    predict yhat, yexpected
\(E[y^*]\) \(\mathbf{x}_i \beta^{H}\) margins, xb
    predict xb, xb

Note that \(\beta^H\) denotes the estimates of \(\beta\) from the Heckman Model.
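As in the Tobit case, these can be verified by hand from the two linear indices \(\mathbf{x}_i\beta^H\) and \(\mathbf{w}_i\gamma\) together with \(\rho\) and \(\sigma\). A minimal Python sketch for one observation (an illustration of the formulas above; note that lambda in the Stata output is exactly \(\rho\sigma\), the coefficient multiplying the inverse Mills ratio):

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def heckman_predictions(xb, zg, rho, sigma):
    """Heckman predicted values for one observation, given the amounts
    index xb = x*beta and the selection index zg = w*gamma."""
    mills = norm_pdf(zg) / norm_cdf(zg)  # inverse Mills ratio
    prob_obs = norm_cdf(zg)              # Prob(selected)
    ycond = xb + rho * sigma * mills     # E[y | y observed]
    yexpected = prob_obs * ycond         # E[y]
    return {"ystar": xb, "prob_obs": prob_obs,
            "ycond": ycond, "yexpected": yexpected}
```

Plugging the first observation's values (x = -2.9543519, w = -2.1461785) into the reported coefficients reproduces, up to rounding, the ycond of -1.771267 listed further below for that (non-selected) observation.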

For example, if we want to predict \(E[y \shortmid y\hspace{.03in} obs]\) for each observation we can just issue this command:

predict ycond, ycond
list y ycond in 1/10

     +------------------------+
     |          y       ycond |
     |------------------------|
  1. |          .   -1.771267 |
  2. | -1.4679083   -1.518555 |
  3. |          .   -1.130811 |
  4. | -1.8399264   -1.269633 |
  5. | -1.4611701   -1.473895 |
     |------------------------|
  6. | -1.4463141   -1.561146 |
  7. | -2.0634319   -1.475339 |
  8. | -.82478092   -2.157227 |
  9. | -.75221966   -1.741791 |
 10. |          .   -1.650602 |
     +------------------------+

This shows that we can use the model to predict outcomes for all observations, including those not selected, i.e. the expected outcome each observation would have if it were selected.