Rob Hicks (Posts about ipython)http://rlhick.people.wm.edu/enSat, 28 Sep 2024 18:30:54 GMTNikola (getnikola.com)http://blogs.law.harvard.edu/tech/rssThe Gordon Schaefer Modelhttp://rlhick.people.wm.edu/posts/gordon-shaefer-model.html<div><div class="cell border-box-sizing code_cell rendered">
<div class="input">
<div class="prompt input_prompt">In [1]:</div>
<div class="inner_cell">
<div class="input_area">
<div class="highlight hl-ipython3"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pylab</span> <span class="k">as</span> <span class="nn">plt</span>
<span class="kn">import</span> <span class="nn">seaborn</span> <span class="k">as</span> <span class="nn">sbn</span>
<span class="kn">from</span> <span class="nn">scipy.optimize</span> <span class="kn">import</span> <span class="n">fmin</span>
</pre></div>
</div>
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Note: this post has been updated for newer versions of python. Content-wise it is mostly identical to the earlier version except for some minor editorial changes. You can interact with this post <a href="https://colab.research.google.com/drive/1qJKw_LPo1Dmykye3PJm3qgixHtDXr4Aq?usp=sharing">live on Google Colab</a>.</p>
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h3 id="Fisheries-Simulation-Model">Fisheries Simulation Model<a class="anchor-link" href="http://rlhick.people.wm.edu/posts/gordon-shaefer-model.html#Fisheries-Simulation-Model">¶</a></h3><p>In this notebook, we examine the workings of the Gordon-Schaefer Fisheries Model for a single species.</p>
<p>Denoting $S(t)$ as the stock at time $t$, we can write the population growth function as</p>
$$\frac{\Delta S}{\Delta t} = \frac{\partial S}{\partial t} = r S(t) \left(1- \frac{S(t)}{K} \right)$$<p>where<br>
$S(t)$ = stock size at time $t$<br>
$K$ = carrying capacity<br>
$r$ = intrinsic growth rate of the population</p>
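<p>As a rough illustration of the growth function above, the sketch below steps the stock forward with a simple Euler update. The parameter values and the helper names <code>logistic_growth</code> and <code>simulate_stock</code> are assumptions made for this example, not taken from the full post.</p>

```python
def logistic_growth(stock, r, K):
    """Growth rate dS/dt = r * S * (1 - S / K)."""
    return r * stock * (1.0 - stock / K)


def simulate_stock(s0, r, K, periods, dt=1.0):
    """Step the stock forward with a simple Euler update."""
    path = [s0]
    for _ in range(periods):
        s = path[-1]
        path.append(s + dt * logistic_growth(s, r, K))
    return path


# Hypothetical parameter values chosen only for illustration.
path = simulate_stock(s0=10.0, r=0.3, K=100.0, periods=50)
print(round(path[-1], 2))
```

<p>With no harvest, the simulated stock rises toward the carrying capacity $K$ and growth falls to zero once $S(t) = K$.</p>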
<p><a href="http://rlhick.people.wm.edu/posts/gordon-shaefer-model.html">Read more…</a> (16 min remaining to read)</p></div></div></div></div>bioeconomicecon322ipythonpubp622http://rlhick.people.wm.edu/posts/gordon-shaefer-model.htmlThu, 19 Oct 2023 06:16:50 GMTMake matplotlib histograms look like R'shttp://rlhick.people.wm.edu/posts/make-matplotlib-like-R.html<div><p>
I prefer the look of <code>R</code>'s histograms. This short post pulls together some resources for mimicking R histograms in <code>Matplotlib</code>.
</p>
<p><a href="http://rlhick.people.wm.edu/posts/make-matplotlib-like-R.html">Read more…</a> (3 min remaining to read)</p></div>ipythonmatplotlibRhttp://rlhick.people.wm.edu/posts/make-matplotlib-like-R.htmlSun, 16 Dec 2018 10:30:50 GMTUsing Autograd for Maximum Likelihood Estimationhttp://rlhick.people.wm.edu/posts/mle-autograd.html<div><p>
Thanks to an excellent series of posts on the python package <code>autograd</code> for automatic differentiation by John Kitchin (e.g. <a href="http://kitchingroup.cheme.cmu.edu/blog/2017/11/22/More-auto-differentiation-goodness-for-science-and-engineering/">More Auto-differentiation Goodness for Science and Engineering</a>), this post revisits some earlier work on <a href="http://rlhick.people.wm.edu/posts/estimating-custom-mle.html">maximum likelihood estimation in Python</a> and investigates the use of auto differentiation. As pointed out in <a href="https://arxiv.org/pdf/1502.05767.pdf">this article</a>, auto-differentiation "can be thought of as performing a non-standard interpretation of a computer program where this interpretation involves augmenting the standard computation with the calculation of various derivatives."
</p>
<p>
Auto-differentiation is neither symbolic differentiation nor numerical approximation using finite difference methods. What auto-differentiation provides is code augmentation: code for the derivatives of your functions is generated for you, free of charge. In this post, we will be using the <code>autograd</code> package in python after defining a function in the usual <code>numpy</code> way. In python, another auto-differentiation choice is the Theano package, which is used by PyMC3, a Bayesian probabilistic programming package that I use in my research and teaching. There are probably other implementations in python, as it is becoming a must-have in the machine learning field. Implementations also exist in C/C++, R, Matlab, and probably others.
</p>
<p>
The three primary reasons for incorporating auto-differentiation capabilities into your research are
</p>
<ol class="org-ol">
<li>In nearly all cases, your code will run faster. For some problems, much faster.</li>
<li>For difficult problems, your model is likely to converge closer to the true parameter values and may be less sensitive to starting values.</li>
<li>Your model will provide more accurate calculations for things like gradients and Hessians (so your standard errors will be more accurately calculated).</li>
</ol>
<p>
With auto-differentiation, gone are the days of deriving analytical derivatives and programming them into your estimation routine. In this short note, we show a simple example of auto-differentiation, expand on that for maximum likelihood estimation, and show that for problems where likelihood calculations are expensive, or where many parameters are being estimated, there can be dramatic speed-ups.
</p>
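<p>
To make the idea concrete, here is a minimal sketch of forward-mode auto-differentiation using dual numbers in plain python. It is not how <code>autograd</code> is implemented; the <code>Dual</code> class and the <code>derivative</code> helper are hypothetical names used only to show how a derivative can be carried along with the ordinary computation.
</p>

```python
import math


class Dual:
    """A value paired with its derivative (forward-mode autodiff)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def exp(x):
    """exp with the chain rule applied to the derivative part."""
    e = math.exp(x.value)
    return Dual(e, e * x.deriv)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the derivative part with 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    return x * x + 3 * x + exp(x)


# The analytic derivative is 2x + 3 + exp(x).
print(derivative(f, 1.0))
```

<p>
The function <code>f</code> is written as ordinary arithmetic, and the derivative comes out exact to floating point precision, with no step size to tune as there would be with finite differences.
</p>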
<p><a href="http://rlhick.people.wm.edu/posts/mle-autograd.html">Read more…</a> (8 min remaining to read)</p></div>autogradipythonmaximum likelihoodhttp://rlhick.people.wm.edu/posts/mle-autograd.htmlTue, 06 Mar 2018 08:30:50 GMTEstimating Custom Maximum Likelihood Models in Python (and Matlab)http://rlhick.people.wm.edu/posts/estimating-custom-mle.html<div><p>
In this post I show various ways of estimating "generic" maximum likelihood models in python. For each, we'll recover standard errors.
</p>
<p>
We will implement a simple ordinary least squares model like this
</p>
\begin{equation}
\mathbf{y = x\beta +\epsilon}
\end{equation}
<p>
where \(\epsilon\) is assumed distributed i.i.d. normal with mean 0 and variance \(\sigma^2\). In our simple model, there is only a constant and one slope coefficient (\(\beta = \begin{bmatrix} \beta_0 & \beta_1 \end{bmatrix}\)).
</p>
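<p>
As a minimal sketch of the likelihood for this model, the code below writes the negative log-likelihood in plain python and evaluates it at the closed-form estimates. The toy data, the true parameter values, and the function names are assumptions made for illustration; the estimation approaches in the full post are not reproduced here.
</p>

```python
import math
import random


def neg_log_likelihood(params, x, y):
    """Negative log-likelihood of y = b0 + b1*x + e, e ~ N(0, sigma^2)."""
    b0, b1, sigma = params
    n = len(y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + ssr / (2 * sigma ** 2)


def closed_form_mle(x, y):
    """For this model the MLE of beta equals OLS; sigma^2 is SSR / n."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b0 = ybar - b1 * xbar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, math.sqrt(ssr / n)


# Hypothetical toy data: true beta0 = 1, beta1 = 2, sigma = 1.
random.seed(0)
x = [random.uniform(0, 10) for _ in range(500)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

mle = closed_form_mle(x, y)
print(mle, neg_log_likelihood(mle, x, y))
```

<p>
A numerical optimizer applied to <code>neg_log_likelihood</code> should land on roughly the same values, which makes this simple model a convenient check before moving to likelihoods with no closed form.
</p>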
<p>
For this model, we would probably never bother going to the trouble of manually implementing maximum likelihood estimators as we show in this post. However, for more complicated models for which there is no established package or command, there are benefits to knowing how to build your own likelihood function and use it for estimation. It is also worth noting that most of the methods shown here don't use analytical gradients or Hessians, so they are likely (1) to have longer execution times and (2) to be less precise than methods where known analytical gradients and Hessians are built into the estimation method. I might explore those issues in a later post.
</p>
<p>
<b>tl;dr</b>: There are numerous ways to estimate custom maximum likelihood models in Python, and what I find is:
</p>
<ol class="org-ol">
<li>For the most features, I recommend using the <a href="http://rlhick.people.wm.edu/posts/estimating-custom-mle.html#org146a4f5"><code>GenericLikelihoodModel</code> class from Statsmodels</a> even if it is the least intuitive way for programmers familiar with Matlab. If you are comfortable with object-oriented programming you should definitely go this route.</li>
<li>For the fastest run times on computationally expensive problems, Matlab will most likely be significantly faster, even with lots of code optimizations on the Python side.</li>
</ol>
<p><a href="http://rlhick.people.wm.edu/posts/estimating-custom-mle.html">Read more…</a> (6 min remaining to read)</p></div>ipythonmatlabmaximum likelihoodhttp://rlhick.people.wm.edu/posts/estimating-custom-mle.htmlSat, 06 May 2017 08:15:50 GMTPart II: Comparing the Speed of Matlab versus Python/Numpyhttp://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy-partii.html<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>In this note, I extend a <a href="http://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy.html">previous post</a> on comparing run-time speeds of various econometrics packages by</p>
<ol>
<li>Adding Stata to the original comparison of Matlab and Python</li>
<li>Calculating runtime speeds by<ul>
<li>Comparing full OLS estimation functions for each package<ul>
<li>Stata: <code>reg</code></li>
<li>Matlab: <code>fitlm</code></li>
<li>Python: <code>regression.linear_model.OLS</code> from the <code>statsmodels</code> module.</li>
</ul>
</li>
<li>Comparing the runtimes for calculations using linear algebra code for the OLS model: $ (x'x)^{-1}x'y $</li>
</ul>
</li>
<li>Since Stata and Matlab automatically parallelize some calculations, we parallelize the python code using the <code>Parallel</code> module.</li>
</ol>
<p><a href="http://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy-partii.html">Read more…</a> (11 min remaining to read)</p></div></div></div>ipythonmatlabhttp://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy-partii.htmlThu, 09 Apr 2015 12:06:21 GMTComparing the Speed of Matlab versus Python/Numpyhttp://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy.html<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p><strong>Update 1</strong>: A more complete and updated speed comparison can be found <a href="http://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy-partii.html">here</a>.</p>
<p><strong>Update 2</strong>: Python and Matlab code edited on 4/5/2015.</p>
<p>In this short note, we compare the speed of matlab and the scientific computing platform of python for a simple bootstrap of an ordinary least squares model. Bottom line (with caveats): matlab is faster than python with this code. One might be able to further optimize the python code below, but it isn't an obvious or easy process (see for example <a href="http://scipy-lectures.github.io/advanced/optimizing/">advanced optimization techniques</a>).</p>
<p>As an aside, this note demonstrates that even if one can't optimize python code significantly enough, it is possible to do computationally expensive calculations in matlab and return results to the ipython notebook.</p>
<h3 id="Data-Setup">Data Setup<a class="anchor-link" href="http://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy.html#Data-Setup">¶</a></h3><p>We will bootstrap the ordinary least squares model (ols) using 1000 replicates. For generating the toy dataset, the true parameter values are
$$
\beta=\begin{bmatrix}
10\\-.5\\.5
\end{bmatrix}
$$</p>
<p>We perform the experiment for 3 different sample sizes ($n = \begin{bmatrix}1,000 & 10,000 & 100,000 \end{bmatrix}$). For each of the observations in the toy dataset, the independent variables are drawn from</p>
$$
\mu_x = \begin{bmatrix} 10\\10 \end{bmatrix}, \sigma_x = \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix}
$$<p>The dependent variable is constructed by drawing a vector of random normal variates from Normal(0,1). Denoting this vector as $\epsilon$, calculate the dependent variable as
$$
\mathbf{Y=Xb+\epsilon}
$$</p>
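<p>The setup above can be sketched in <code>numpy</code> roughly as follows. This is an illustration under stated assumptions, not the code timed in the full post: $\sigma_x$ is treated as a covariance matrix, only the smallest sample size is used, and the function names <code>ols</code> and <code>bootstrap_ols</code> are made up for this example.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000
beta = np.array([10.0, -0.5, 0.5])      # true parameters from the text
mu_x = np.array([10.0, 10.0])
cov_x = np.array([[4.0, 0.0],
                  [0.0, 4.0]])          # assumed to be a covariance matrix

# Toy dataset: constant plus two normal regressors, standard normal errors.
X = np.column_stack([np.ones(n), rng.multivariate_normal(mu_x, cov_x, size=n)])
y = X @ beta + rng.standard_normal(n)


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def bootstrap_ols(X, y, replicates, rng):
    """Re-estimate OLS on rows resampled with replacement."""
    n = len(y)
    draws = np.empty((replicates, X.shape[1]))
    for i in range(replicates):
        idx = rng.integers(0, n, size=n)
        draws[i] = ols(X[idx], y[idx])
    return draws


b_hat = ols(X, y)
draws = bootstrap_ols(X, y, replicates=1000, rng=rng)
print(b_hat)
print(draws.std(axis=0))   # bootstrap standard errors
```

<p>Solving the normal equations with <code>np.linalg.solve</code> avoids forming an explicit inverse, which is usually both faster and numerically safer than computing $(X'X)^{-1}$ directly.</p>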
<p><a href="http://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy.html">Read more…</a> (2 min remaining to read)</p></div></div></div>ipythonmatlabhttp://rlhick.people.wm.edu/posts/comparing-the-speed-of-matlab-versus-pythonnumpy.htmlThu, 19 Mar 2015 12:07:34 GMTTapping MariaDB / MySQL data from Ipythonhttp://rlhick.people.wm.edu/posts/tapping-mysql-ipython.html<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>In this short post, I will outline how one can access data stored in a database like MariaDB or MySQL for analysis inside an Ipython Notebook. There are many reasons why you might want to store your data in a proper database. For me the most important are:</p>
<ol>
<li><p>All of my data resides in a password protected and more secure place than having a multitude of csv, mat, and dta files scattered all over my file system.</p>
</li>
<li><p>If you access the same data for multiple projects, any changes to the underlying data will be propagated to your analysis, without having to update copies of project data.</p>
</li>
<li><p>Having data in a central repository makes backup and recovery significantly easier.</p>
</li>
<li><p>This allows for two-way interaction with your database. You can read and write tables from/to your database. Rather than use SQL, you can create database tables using pandas/ipython.</p>
</li>
</ol>
<p><a href="http://rlhick.people.wm.edu/posts/tapping-mysql-ipython.html">Read more…</a> (2 min remaining to read)</p></div></div></div>ipythonmariadbmysqlhttp://rlhick.people.wm.edu/posts/tapping-mysql-ipython.htmlFri, 06 Mar 2015 11:39:36 GMTComparing Stata and Ipython Commands for OLS Modelshttp://rlhick.people.wm.edu/posts/comparing-stata-and-ipython-commands-for-ols-models.html<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>In this note, I'll explore the Ipython <code>statsmodels</code> package for estimating linear regression models (OLS). The goal is to completely map stata commands for <code>reg</code> into something implementable in Ipython.</p>
<p><a href="http://rlhick.people.wm.edu/posts/comparing-stata-and-ipython-commands-for-ols-models.html">Read more…</a> (6 min remaining to read)</p></div></div></div>bootstrapeconometricsipythonstatahttp://rlhick.people.wm.edu/posts/comparing-stata-and-ipython-commands-for-ols-models.htmlMon, 02 Mar 2015 12:15:41 GMTRunning R and Matlab Commands in an Ipython Notebookhttp://rlhick.people.wm.edu/posts/running-r-and-matlab-commands-in-an-ipython-notebook.html<div class="cell border-box-sizing text_cell rendered"><div class="prompt input_prompt">
</div><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The ipython notebook environment is a superb environment for empirical research. Sometimes, though, you would like to access the capabilities of other software. This post shows how to incorporate R and Matlab into ipython notebooks.</p>
<p><a href="http://rlhick.people.wm.edu/posts/running-r-and-matlab-commands-in-an-ipython-notebook.html">Read more…</a> (2 min remaining to read)</p></div></div></div>ipythonmatlabRhttp://rlhick.people.wm.edu/posts/running-r-and-matlab-commands-in-an-ipython-notebook.htmlThu, 26 Feb 2015 16:06:21 GMT