Update 1: A more complete and updated speed comparison can be found here.
Update 2: Python and Matlab code edited on 4/5/2015.
In this short note, we compare the speed of matlab and the scientific computing platform of python for a simple bootstrap of an ordinary least squares model. Bottom line (with caveats): matlab is faster than python with this code. One might be able to further optimize the python code below, but it isn't an obvious or easy process (see for example advanced optimization techniques).
As an aside, this note demonstrates that even if one can't optimize python code significantly enough, it is possible to do computationally expensive calculations in matlab and return results to the ipython notebook.
Data Setup¶
We will bootstrap the ordinary least squares model (ols) using 1000 replicates. For generating the toy dataset, the true parameter values are
$$
\beta=\begin{bmatrix}
10\\-.5\\.5
\end{bmatrix}
$$
We perform the experiment for 3 different sample sizes ($n = \begin{bmatrix}1,000 & 10,000 & 100,000 \end{bmatrix}$). For each of the observations in the toy dataset, the independent variables are drawn from
$$
\mu_x = \begin{bmatrix} 10\\10 \end{bmatrix}, \sigma_x = \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix}
$$
The dependent variable is constructed by drawing a vector of random normal variates from Normal(0,1). Denoting this vector as $\epsilon$ calculate the dependent variable as
$$
\mathbf{Y=Xb+\epsilon}
$$
Read more…