#+BEGIN_COMMENT
.. title: Using stata_kernel and Emacs Orgmode for reproducible research goodness
.. slug: stata_kernel_emacs
.. date: 2020-07-07 10:31:50 UTC+01:00
.. tags: stata, orgmode
.. status:
.. has_math: yes
.. category:
.. link:
.. description:
.. type: text
#+END_COMMENT

This post is hopefully the last in a series of posts outlining how to use =Stata= in a proper dynamic document/reproducible research setting using Emacs. As of the summer of 2020, I am only using [[https://github.com/kylebarron/stata_kernel][=stata_kernel=]] for my own work and no longer recommend using [[./stata-and-literate-programming-in-emacs-org-mode.html][my customized =ob-ipython.el=]] for reasons described [[https://gitlab.com/robhicks/ob-stata.el/-/issues/10#note_374134363][here]]. This post shows the installation steps to get this working and some usability recommendations if using Org-mode.

Before proceeding with anything below, make sure you complete the [[./stata_kernel_jupyterlab.html#Python Preliminaries]["Python Preliminaries" steps]] first.

#+HTML:

The following will give you a quick idea of how things look once everything is set up properly:

[[https://raw.githubusercontent.com/roblem/stata_kernel/master/docs/src/img/emacs-orgmode-jupyter-stata-demo.gif]]

* Running Stata Commands in Emacs

#+BEGIN_SRC emacs-lisp :exports none :eval never-export
(setf (cdr (assoc :results org-babel-default-header-args:ipython)) "output replace")
#+END_SRC

#+RESULTS:
: output replace

Once you have set up the python environment following the steps above, do this in emacs:

1. Install and load [[https://github.com/dzop/emacs-jupyter][emacs-jupyter.el]]
2. Ensure that you have activated the python environment where =stata_kernel= is available
3. Add the following lines to your =init.el=:

#+BEGIN_SRC emacs-lisp :exports code :eval never-export
(when (functionp 'module-load)
  (use-package jupyter)
  (with-eval-after-load 'org
    (org-babel-do-load-languages
     'org-babel-load-languages
     '((jupyter .
        t))))
  (with-eval-after-load 'jupyter
    (define-key jupyter-repl-mode-map (kbd "C-l") #'jupyter-repl-clear-cells)
    (define-key jupyter-repl-mode-map (kbd "TAB") #'company-complete-common-or-cycle)
    (define-key jupyter-org-interaction-mode-map (kbd "TAB") #'company-complete-common-or-cycle)
    (define-key jupyter-repl-interaction-mode-map (kbd "C-c C-r") #'jupyter-eval-line-or-region)
    (define-key jupyter-repl-interaction-mode-map (kbd "C-c M-r") #'jupyter-repl-restart-kernel)
    (define-key jupyter-repl-interaction-mode-map (kbd "C-c M-k") #'jupyter-shutdown-kernel)
    ;; use company with the completion-at-point backend in org buffers
    (add-hook 'jupyter-org-interaction-mode-hook
              (lambda ()
                (company-mode)
                (setq company-backends '(company-capf))))
    ;; same completion setup in the REPL, plus prompt colors
    (add-hook 'jupyter-repl-mode-hook
              (lambda ()
                (company-mode)
                (set-face-attribute 'jupyter-repl-input-prompt nil :foreground "black")
                (set-face-attribute 'jupyter-repl-output-prompt nil :foreground "grey")
                (setq company-backends '(company-capf))))
    (setq jupyter-repl-prompt-margin-width 4)))

;; associate jupyter-stata with stata (fixes fontification if using pygmentize for html export)
(add-to-list 'org-src-lang-modes '("jupyter-stata" . stata))
(add-to-list 'org-src-lang-modes '("Jupyter-Stata" . stata))
;; you **may** need this for latex output syntax highlighting
;; (add-to-list 'org-latex-minted-langs '(stata "stata"))
#+END_SRC

Additionally, remove =("ipython" . "ipython")= and =("stata" . "stata")= from ='org-babel-load-languages= in your =init.el= (if you have =ob-ipython= installed).

* Usage

Stata code blocks need to look like this:

#+BEGIN_SRC jupyter-stata :session stata :exports code :eval never-export
,#+BEGIN_SRC jupyter-stata :session stata :kernel stata
sysuse auto
sum
,#+END_SRC
#+END_SRC

#+RESULTS:
#+begin_example
(1978 Automobile Data)

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        make |          0
       price |         74    6165.257    2949.496       3291      15906
         mpg |         74     21.2973    5.785503         12         41
       rep78 |         69    3.405797    .9899323          1          5
    headroom |         74    2.993243    .8459948        1.5          5
-------------+---------------------------------------------------------
       trunk |         74    13.75676    4.277404          5         23
      weight |         74    3019.459    777.1936       1760       4840
      length |         74    187.9324    22.26634        142        233
        turn |         74    39.64865    4.399354         31         51
displacement |         74    197.2973    91.83722         79        425
-------------+---------------------------------------------------------
  gear_ratio |         74    3.014865    .4562871       2.19       3.89
     foreign |         74    .2972973    .4601885          0          1
#+end_example

Note the header arguments "=jupyter-stata :session stata=". The session name (in this case "stata") can be anything you'd like, but it must be present.

With =:exports both=, the exported document shows both the syntax-highlighted code and its output:

#+BEGIN_SRC jupyter-stata :session stata :results output :exports both :eval never-export
sum price trunk headroom
#+END_SRC

#+RESULTS:
:
: (1978 Automobile Data)
:
:
:     Variable |        Obs        Mean    Std. Dev.       Min        Max
: -------------+---------------------------------------------------------
:        price |         74    6165.257    2949.496       3291      15906
:        trunk |         74    13.75676    4.277404          5         23
:     headroom |         74    2.993243    .8459948        1.5          5

Display the first 5 observations using the R-like head magic:

#+BEGIN_SRC jupyter-stata :session stata :results output :exports both :eval never-export
%head 5 if price > 3000
#+END_SRC

#+RESULTS:
#+begin_export html
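<table>
  <thead>
    <tr><th></th><th>make</th><th>price</th><th>mpg</th><th>rep78</th><th>headroom</th><th>trunk</th><th>weight</th><th>length</th><th>turn</th><th>displacement</th><th>gear_ratio</th><th>foreign</th></tr>
  </thead>
  <tbody>
    <tr><th>1</th><td>AMC Concord</td><td>4099</td><td>22</td><td>3</td><td>2.5</td><td>11</td><td>2930</td><td>186</td><td>40</td><td>121</td><td>3.5799999</td><td>Domestic</td></tr>
    <tr><th>2</th><td>AMC Pacer</td><td>4749</td><td>17</td><td>3</td><td>3</td><td>11</td><td>3350</td><td>173</td><td>40</td><td>258</td><td>2.53</td><td>Domestic</td></tr>
    <tr><th>3</th><td>AMC Spirit</td><td>3799</td><td>22</td><td>.</td><td>3</td><td>12</td><td>2640</td><td>168</td><td>35</td><td>121</td><td>3.0799999</td><td>Domestic</td></tr>
    <tr><th>4</th><td>Buick Century</td><td>4816</td><td>20</td><td>3</td><td>4.5</td><td>16</td><td>3250</td><td>196</td><td>40</td><td>196</td><td>2.9300001</td><td>Domestic</td></tr>
    <tr><th>5</th><td>Buick Electra</td><td>7827</td><td>15</td><td>4</td><td>4</td><td>20</td><td>4080</td><td>222</td><td>43</td><td>350</td><td>2.4100001</td><td>Domestic</td></tr>
  </tbody>
</table>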
#+end_export

Note: In your =Org-Mode= buffer the above table doesn't display nicely (since by default it returns =html=). You might want to use the =:display text/plain= header argument while you are developing your document.

#+BEGIN_SRC jupyter-stata :session stata :results output :exports both :display text/plain
bstrap: regress price mpg headroom trunk
#+END_SRC

#+RESULTS:
#+begin_example
(running regress on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Linear regression                               Number of obs     =         74
                                                Replications      =         50
                                                Wald chi2(2)      =      15.48
                                                Prob > chi2       =     0.0004
                                                R-squared         =     0.2272
                                                Adj R-squared     =     0.2054
                                                Root MSE          =  2629.1564

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
       price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -259.1057   67.92036    -3.81   0.000    -392.2271   -125.9842
    headroom |  -334.0215   318.7159    -1.05   0.295    -958.6932    290.6503
       _cons |   12683.31   2209.929     5.74   0.000     8351.933     17014.7
------------------------------------------------------------------------------
#+end_example

** Displaying and Exporting Graphics

One notable "gotcha" that has always been an issue is that =stata_kernel= uses the console version (on linux) of =stata=, which is fully functional with one exception: *stata cannot output png files* when displaying or exporting graphics. =stata_kernel= sidesteps this by producing =svg= graphics along with =pdf= graphics files for each figure displayed in the notebook. This causes some difficulties that vary depending on what we are looking for (showing figures inline in =emacs=, exporting to =html=, or exporting to =pdf=). I gather that these issues aren't relevant for Windows (I am not sure about macOS).
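
Since the figures arrive as =svg=, showing them inline relies on Emacs itself being able to render that format. The following is a quick sanity check of my own (it is not part of =stata_kernel= or =emacs-jupyter=), assuming you are in a graphical Emacs session; it only uses the built-in function =image-type-available-p=:

#+BEGIN_SRC emacs-lisp :exports code :eval never-export
;; Sketch: evaluate in a graphical Emacs frame (e.g. with M-:).
;; Returns t when this Emacs build can render svg images inline, nil otherwise.
(image-type-available-p 'svg)
#+END_SRC

If this returns =nil=, figures will likely show up as plain file links in the buffer rather than as images.
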
#+BEGIN_SRC jupyter-stata :session stata :exports none :eval never-export
set scheme plotplainblind
#+END_SRC

#+RESULTS:

If we wish to display a histogram of price in the Emacs buffer, we can execute this:

#+BEGIN_SRC jupyter-stata :session stata :eval never-export :exports code
,#+BEGIN_SRC jupyter-stata :session stata
hist price
,#+END_SRC
#+END_SRC

#+RESULTS:
:RESULTS:
:
: Unknown #command
: (bin=8, start=3291, width=1576.875)
#+attr_org: :width 600 :height 400
[[file:./.ob-jupyter/c5e484552042d4b4ea7e9d9ed8a1d2e585833719.svg]]
: This front-end cannot display the desired image type.
:
:
: Unknown #command
:END:

#+BEGIN_SRC jupyter-stata :session stata :eval never-export :exports results
hist price
#+END_SRC

#+RESULTS:
:RESULTS:
: (bin=8, start=3291, width=1576.875)
#+attr_org: :width 600 :height 400
[[file:./.ob-jupyter/c5e484552042d4b4ea7e9d9ed8a1d2e585833719.svg]]
: This front-end cannot display the desired image type.
:END:

While the =html= export you are viewing above isn't a good result, the Emacs buffer itself always displays the image:

[[../site_pics/stata_kernel_inbuffer.png]]

Exporting to =html= is the primary issue with this method. Additionally, we have the warning message =: This front-end cannot display the desired image type.= This is because the results also return a pdf of the image, which orgmode can't deal with. We can eliminate the warning by running this magic:

#+BEGIN_SRC jupyter-stata :session stata :exports both
%set graph_svg_redundancy False
#+END_SRC

*** A robust approach for viewing and exporting Graphics

To sidestep this problem and have a more general solution, I suggest the following strategy: continue to use the results from codeblock execution to view figures inside Emacs, but also save them to disk and then reference them manually in orgmode for more robust exporting.

#+BEGIN_SRC jupyter-stata :session stata :exports code
,#+BEGIN_SRC jupyter-stata :session stata :kernel stata :exports code
hist price
graph export "/tmp/hist.svg", replace
,#+END_SRC
#+END_SRC

#+RESULTS:
:RESULTS:
:
: (bin=8, start=3291, width=1576.875)
#+attr_org: :width 600 :height 400
[[file:./.ob-jupyter/c5e484552042d4b4ea7e9d9ed8a1d2e585833719.svg]]
: This front-end cannot display the desired image type.
:
:
: (file /tmp/hist.svg written in SVG format)
:
:END:

Then we can manually add a link to this file in our orgmode document via ~[[/tmp/hist.svg]]~, to include the histogram in a way that should be robust to whatever document type we wish to export to. It is worth noting that in the Emacs buffer, you will likely see the image twice (once in the results object that we aren't exporting, and once for the manual link you've created). You can turn off the second of these with =org-toggle-inline-images=.

[[../site_pics/hist.svg]]

This method has the added benefit of better/customized placement for figures using =org-mode= =#+attr_html= or =#+attr_latex= directives.

* Conclusion

This post shows how to use =stata_kernel= with Emacs. The method outlined here is superior to [[./stata-and-literate-programming-in-emacs-org-mode.html][one that uses an updated version of =ob-stata.el= and Emacs Speaks Statistics (=ESS=) that I wrote about over a year ago]], as Stata support there has been deprecated for current releases and my modified script no longer works (i.e., after summer 2020). Even the somewhat inconvenient way of dealing with graphical output is no worse than what was required before.