hugetim / nbstata
A Jupyter kernel for Stata built on pystata
Home Page: https://hugetim.github.io/nbstata/
License: GNU General Public License v3.0
They're not being displayed for some reason; they pass silently, and then only the "pystata module not found" error shows in the notebook.
There should be a way for the user to determine, from within a notebook, what configuration file is in effect.
This concerns the Notebook (not JupyterLab, which is handled by a separate package). nbstata currently doesn't show any syntax highlighting within the Notebook interface:
Yet the same notebook downloaded with the "PDF via LaTeX" option produces a PDF document with syntax highlighting that is very close to stock Stata colors:
Is it possible to have the syntax highlighting seen in the PDF output applied within the Notebook interface?
(On a side note, is it possible to get rid of those <IPython.core.display.HTML object> lines?)
I wonder if there is any way to add a "run" functionality to nbstata when using it in other environments, specifically Quarto.
Right now, I have been able to make it work using "jupyter:nbstata". However, I was wondering whether allowing for a "run" option depends on the kernel and nbstata, or is rather something to do with Quarto.
Thank you
%help without a command name listed after it causes cell execution to fail, rather than displaying a helpful error message. This traceback shows up in the console:
-[IPKernelApp] ERROR | Exception in message handler:
Traceback (most recent call last):
File "C:\Users\tjhuegerich\AppData\Local\Continuum\anaconda3\envs\Python2022-05\lib\site-packages\ipykernel\kernelbase.py", line 406, in dispatch_shell
await result
File "C:\Users\tjhuegerich\AppData\Local\Continuum\anaconda3\envs\Python2022-05\lib\site-packages\ipykernel\kernelbase.py", line 721, in execute_request
reply_content = self.do_execute(
File "c:\users\tjhuegerich\onedrive - lrca\nbstata\nbstata\kernel.py", line 126, in do_execute
code_cell = Cell(self, code, silent)
File "c:\users\tjhuegerich\onedrive - lrca\nbstata\nbstata\cell.py", line 22, in __init__
self.code = magic_handler.magic(code_w_magics, kernel, self)
File "c:\users\tjhuegerich\onedrive - lrca\nbstata\nbstata\magics.py", line 178, in magic
code = self._do_magic(name, code, kernel, cell)
File "c:\users\tjhuegerich\onedrive - lrca\nbstata\nbstata\magics.py", line 164, in _do_magic
return getattr(self, "magic_" + name.lstrip('%'))(code, kernel, cell)
File "c:\users\tjhuegerich\onedrive - lrca\nbstata\nbstata\magics.py", line 425, in magic_help
'text/html': self._get_help_html(code),
File "c:\users\tjhuegerich\onedrive - lrca\nbstata\nbstata\magics.py", line 383, in _get_help_html
soup.find('h2').decompose()
AttributeError: 'NoneType' object has no attribute 'decompose'
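One way to avoid this kind of failure is to validate the magic's argument before any help page is fetched or parsed. The following is only a minimal sketch, not nbstata's actual code: `parse_help_argument` and `MagicUsageError` are hypothetical names, and the wording of the usage message is an assumption.

```python
class MagicUsageError(ValueError):
    """Raised when a magic is called with invalid arguments (hypothetical name)."""


def parse_help_argument(code: str) -> str:
    """Return the command name passed to %help, or raise a readable error.

    Checking for an empty argument up front means later parsing steps
    (such as looking up an <h2> element that may not exist) are never
    reached with nothing to look up.
    """
    command = code.strip()
    if not command:
        raise MagicUsageError("Usage: %help <command name>, e.g. %help regress")
    return command
```

The kernel could catch the exception and print its message in the cell output instead of letting the traceback reach the console. Separately, checking whether `soup.find('h2')` returned `None` before calling `decompose()` would guard the parsing step itself.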
Hi Tim,
Quick question. Is there anything special one would need to do so nbstata could work with Stata18?
Thank you!
F
Would it be straightforward to add this functionality? As in @ticoneva's pystata-kernel:
I tried to look into the documentation but couldn't find information about the following.
Is it possible to do inline coding?
For example, something like this inline:
typing stata e(r2) (or similar) to display the desired outcome (print the R2 of a regression, in line).
Thank you
Hi Tim,
Just a quick question (which I have also sent to the Quarto devs).
Not sure if you kept exploring Quarto, but I encountered an odd problem when rendering documents straight from Stata.
Consider this:
---
title: "Untitled"
format:
html: default
ipynb: default
pdf: default
jupyter: nbstata
---
First setup
```{stata}
clear
ssc instal jwdid
set linesize 250
set seed 111
set sortseed 111
set obs 100 // <- 100 units
gen id = _n
gen ai = rchi2(2)
// determines When would units receive treatment
gen g = runiformint(2,10)
replace g = 0 if g>9 // never treated
expand 10 // <-T=10
bysort id:gen t=_n
gen event = max(0,t-g)
gen aux = runiform()*2
bysort t:gen at = aux[1] // Determines Time fixed effect
gen te = (1-g/10)+(1-event/10)
// Treatment effect but vanishes with time
gen eit= rnormal()
gen trt = (t>=g)*(g>0)
gen teff = te * trt
gen y = ai + at + te * trt + eit
```
then 2 regressions
```{stata}
qui: jwdid y, ivar(i) tvar(t) gvar(g)
estat simple
```
but also
```{stata}
qui: jwdid y, ivar(i) tvar(t) gvar(g) never
estat simple
```
So when I render this into HTML, I get two outcomes from the regressions:
a good one
------------------------------------------------------------------------------
             |            Delta-method
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
   (2 vs 1)  |   1.543749   .1567784     9.85   0.000     1.236469    1.851029
------------------------------------------------------------------------------
but also one misaligned
------------------------------------------------------------------------------
| Delta-method
| Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
(2 vs 1) |
1.458058 .1494798 9.75 0.000 1.165083 1.751033
------------------------------------------------------------------------------
Do you have any idea why this would happen?
Is it possible to get "browse" to work as it did with stata_kernel (i.e. open Stata's native browser in a new window)? I'm still using the Hydrogen package in Atom to run my code, so the currently-implemented %browse magic doesn't work for me.
As can be seen in the output for cell [11] here, the %help magic output exhibits excessive spacing between lines.
Let's revise the HTML or CSS to remove that extra space.
Relevant files to edit:
https://github.com/hugetim/nbstata/blob/master/nbs/09_magics.ipynb
https://github.com/hugetim/nbstata/blob/master/nbstata/css/_StataKernelHelpDefault.css
Possibly relevant:
jupyter/notebook#6173
Small issue with the docs. The link in the user guide is broken: (missing Stata's version in pystata18)
graph_format: Acceptable values are ‘png’ (the default), ‘pdf’, ‘svg’ and ‘pystata’. Specify the last option if you want to use pystata’s default setting.
graph_format: Acceptable values are ‘png’ (the default), ‘pdf’, ‘svg’ and ‘pystata’. Specify the last option if you want to use pystata’s [default setting](https://www.stata.com/python/pystata18/config.html#pystata.config.set_graph_format).
In exploratory analysis I will sometimes do this:
preserve
generate somevar = somecommand
* and various other commands
and run the cell, the intention being to inspect the results with a %browse or similar.
But if I run a cell like this, and then from another cell try to summarize somevar, I get a variable not found error. Similarly, if I restore, I get "nothing to restore, r(622);".
The only way I can make sense of this is if nbstata is using preserve/restore itself, and unwinding my preserve behind the scenes. If so it might be a good idea to detect preserve in user input and throw an error.
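The detection suggested above could look roughly like this. It is a sketch under stated assumptions, not how nbstata handles cells: `uses_preserve` is a hypothetical helper, it only recognizes the full command name `preserve` (optionally after a quietly/capture/noisily prefix), and it skips only whole-line `*` and `//` comments.

```python
import re

# Matches a line whose command is `preserve`, optionally behind common prefixes.
# Assumption: abbreviations of `preserve` itself are not handled.
_PRESERVE_RE = re.compile(
    r"^\s*(?:(?:quietly|qui|capture|cap|noisily|noi)(?:\s*:\s*|\s+))*preserve\b"
)


def uses_preserve(code: str) -> bool:
    """Return True if any non-comment line of the cell calls `preserve`."""
    for line in code.splitlines():
        stripped = line.lstrip()
        # Skip whole-line Stata comments.
        if stripped.startswith("*") or stripped.startswith("//"):
            continue
        if _PRESERVE_RE.match(line):
            return True
    return False
```

A kernel could call such a check before running the cell and show a warning or error explaining that a preserve will not survive past the end of the cell.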
nbstata 0.6.2 on Jupyter Notebook produces the following warning message:
/Users/gaksaray/anaconda3/envs/nbstata/lib/python3.10/site-packages/nbformat/__init__.py:128: MissingIDFieldWarning: Code cell is missing an id field, this will become a hard error in future nbformat versions. You may want to use `normalize()` on your notebooks before validations (available since nbformat 5.1.4). Previous versions of nbformat are fixing this issue transparently, and will stop doing so in the future.
validate(nb)
/Users/gaksaray/anaconda3/envs/nbstata/lib/python3.10/site-packages/notebook/services/contents/manager.py:353: MissingIDFieldWarning: Code cell is missing an id field, this will become a hard error in future nbformat versions. You may want to use `normalize()` on your notebooks before validations (available since nbformat 5.1.4). Previous versions of nbformat are fixing this issue transparently, and will stop doing so in the future.
validate_nb(model['content'])
It doesn't seem to cause any problems but it might be worth checking at some point.
IPython : 8.10.0
ipykernel : 6.19.2
ipywidgets : 8.0.4
jupyter_client : 8.1.0
jupyter_core : 5.3.0
jupyter_server : 1.23.4
jupyterlab : 3.5.3
nbclient : 0.5.13
nbconvert : 6.5.4
nbformat : 5.7.0
notebook : 6.5.3
qtconsole : 5.4.0
traitlets : 5.7.1
Here's a fun one.
Try this:
%%echo
python:
try:
print("This works!")
except:
pass
end
Then try this:
%%noecho
python:
try:
print("This raises an IndentationError.")
except:
pass
end
I assume that the behind-the-scenes conversion nbstata is doing to make the cells into programs is doing something to the indentation that Python doesn't like.
A solution might be a %%python magic, which dumps the cell to a temporary .py file and runs it using pystata.stata.run("python script tempfile.py"), so the python still runs in the inside-Stata python context instead of the kernel environment?
(My use case is trying to offload fetching and parsing some remote JSON to python before passing it back to Stata, since Stata's JSON-handling tools are so terrible.)
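The temporary-file idea proposed above might be sketched as follows. This is an illustration only, assuming a hypothetical %%python magic: `write_python_cell` and `stata_command_for` are made-up helper names, and the sketch stops at building the Stata command string rather than calling pystata.

```python
import tempfile
import textwrap
from pathlib import Path
from typing import Optional


def write_python_cell(cell_body: str, directory: Optional[str] = None) -> Path:
    """Write the cell's Python code to a temporary .py file and return its path."""
    # Remove indentation shared by every line so the file starts at column 0.
    source = textwrap.dedent(cell_body).strip("\n") + "\n"
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", dir=directory, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(source)
    return Path(handle.name)


def stata_command_for(script_path: Path) -> str:
    """Build the Stata command that runs the script in Stata's embedded Python."""
    return f'python script "{script_path}"'
```

The resulting string could then be handed to pystata.stata.run(), as the issue suggests, and the temporary file deleted afterwards. Because the code reaches Python from a file, its indentation is not touched by whatever the kernel does to cell text.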
Hi,
If nbstata works only with Stata 17+, why are the help pages shown in the animated GIF for Stata 15?
For example, a cell with just /* on the first line causes the cell to fail upon run. (No error message. Cell number remains [*].)
I wonder if you'd consider publishing an nbstata conda-forge package? Noting that the installation instructions already recommend using conda for jupyterlab, at the moment if I'm installing nbstata into a fresh conda environment I have to either:
pip install nbstata
; or pip install nbstata
and have a bunch of dependencies that conda doesn't manage.
Thanks in advance!
Use literal "default"s for pystata.config.set_graph_size for nbstata's default. This will probably require a breaking change to the API, but it's worth it.
Originally posted by friosavila May 11, 2023
Hi Tim
I was wondering if there is any way to modify how nbstata works regarding figures.
I know that currently you can change the setup if you would like larger or smaller figures to be output in Jupyter Notebook.
However, would it be possible to use the information from "xsize() and ysize()" to change those settings in a more dynamic way?
Thank you!
Quarto is blowing my mind. It seemed impenetrable as I browsed randomly through the reference docs to find answers to particular questions. But then I finally broke down and...worked through the tutorials. 🌩💡
Working through the tutorials will be a bigger challenge for Stata users without experience in any of the tutorial reference languages, so I'd like to add a Stata Quarto tutorial to the docs here soon.
Originally posted by @hugetim in #11 (comment)
At the moment, the %browse magic only operates on the current frame, which is consistent with Stata itself. [1]
However, because the %browse magic only works on the first line of the cell, if you want to look at the contents of another frame, you have to either use the notebook console to switch frames, or create a cell containing cwf otherframe and then a second cell containing %browse, which is both messy and easy to lose track of.
It would be great to be able to explicitly specify a frame as a one-liner, e.g. %browse if somevariable == 1, frame(someframe).
[1] I don't know if frame someframe: browse is supposed to work; it doesn't throw a prefix-not-allowed error, but it ignores the frame specifier and always browses the current frame.
Add something about this to the User Guide, and maybe also about other nbstata utilities that may be helpful when working in a Python notebook:
When you want to use Stata within a Python notebook (as opposed to a Stata notebook), nbstata provides a convenient way to load Stata so that the %%stata magic will work. For most users (when a config file is not needed to locate the Stata directory), the following lines will do it:
import nbstata
nbstata.launch_stata()
Compare Method 1 in the official pystata docs. (All the methods suggested by Stata require you to type out the Stata directory and edition in the code to initialize pystata.)
In nbstata on Jupyter Notebook, is it possible to have a language definition for local macros such that closing ticks are automatically inserted (as in the Stata do-file editor)? Currently, there is autocompletion for double and single quotes, but not for Stata macros:
What's worse, when the user inserts the closing macro tick, Jupyter automatically adds another spurious single quote, which must be deleted manually.
Release notes have been minimal up to this point. A proper CHANGELOG, at least going forward from the most recent version, would be helpful to provide guidance on the relative importance of particular updates.
The XDG Base Directory specification states that config files should be located in $XDG_CONFIG_HOME, which should default to $HOME/.config.
Can we move nbstata.conf to e.g. $XDG_CONFIG_HOME/nbstata/nbstata.conf, while still reading from the old path ($HOME/.nbstata.conf) for backwards compatibility?
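The lookup order requested here could be sketched as below. This is not nbstata's implementation: `candidate_config_paths` and `find_config` are hypothetical names, and the sketch treats an unset or empty XDG_CONFIG_HOME as $HOME/.config.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def candidate_config_paths(
    env: Optional[Mapping[str, str]] = None, home: Optional[Path] = None
) -> List[Path]:
    """List config locations in priority order: XDG first, then the legacy path."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # An unset or empty XDG_CONFIG_HOME falls back to $HOME/.config.
    xdg_home = env.get("XDG_CONFIG_HOME") or str(home / ".config")
    return [
        Path(xdg_home) / "nbstata" / "nbstata.conf",  # proposed new location
        home / ".nbstata.conf",  # old location, kept for backwards compatibility
    ]


def find_config(
    env: Optional[Mapping[str, str]] = None, home: Optional[Path] = None
) -> Optional[Path]:
    """Return the first existing config file, or None if there is none."""
    for path in candidate_config_paths(env, home):
        if path.is_file():
            return path
    return None
```

Returning the resolved path, rather than only the parsed settings, would also let a notebook user see which configuration file is in effect.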
There is currently a bug in the setup: it doesn't install the logo and css assets.
I'm not sure how to contribute to the user guide.
I wanted to write that you should make sure the stata command (that is, the CLI to Stata, not to be confused with xstata, which is the GUI) works; otherwise nbstata won't work.
Also, if you are on Arch Linux and you get this error when running stata:
stata: error while loading shared libraries: libncurses.so.5: cannot open shared object file: No such file or directory
You need to install ncurses5-compat-libs from the AUR to fix it.
I get the following error/warning message using nbstata on Jupyter Notebook:
404 GET /static/components/codemirror/mode/stata/stata.js?v=20230404164645 (::1) 0.560000ms referer=http://localhost:8888/notebooks/Untitled.ipynb
even after the notebook is trusted:
I'm not sure if this is related to nbstata or a generic Notebook issue.
The kernel seems to fail to start when these config options from the example in the docs are included in nbstata.conf:
graph_width = 15cm
graph_height = 11cm
The error printed in the notebook:
ModuleNotFoundError: pystata path not found
A Stata 17 installation is required to use the nbstata Stata kernel. If you already have Stata 17 installed, please specify its path in your configuration file.
Based on the documentation I tried other units too but with same failed result:
graph_width = 720px
graph_height = 480px
Version of nbstata used:
nbstata 0.6.2
The only option was to omit them, which is fine for now, but generally should we expect this to work?
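For reference, values such as 15cm or 720px could be validated while the config file is read, so that a bad value is reported as a config error rather than surfacing as a misleading "pystata path not found". The sketch below is only an illustration: `parse_graph_size` is a hypothetical helper, and the accepted units (in, cm, px, or none) are an assumption based on the values tried above.

```python
import re
from typing import Optional, Tuple

# A number, optionally followed by a unit. Assumed units: in, cm, px.
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(in|cm|px)?\s*$")


def parse_graph_size(value: str) -> Tuple[float, Optional[str]]:
    """Split a size such as '15cm' into (15.0, 'cm'); the unit is None if omitted."""
    match = _SIZE_RE.match(value)
    if match is None:
        raise ValueError(f"invalid graph size: {value!r}")
    number, unit = match.groups()
    return float(number), unit
```

A check like this runs before Stata is located, which keeps a config-value problem separate from a Stata-path problem in the error message the user sees.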
Hi Tim,
I recently noticed a small problem (not sure if it's nbstata or pystata).
When copying a chunk of code from a do-file into a Quarto document and running the code, some problems emerge if the code starts with a tab (for long code).
This can be easily fixed, however, if one uses spaces instead of tabs.
It's a minor problem, but it may bite more when transferring from a do-file to Quarto (and Jupyter?).
Thank you
F
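Until the cause is tracked down, the workaround described above (spaces instead of tabs) can be automated before pasting code. A minimal sketch, assuming a tab width of 4; `leading_tabs_to_spaces` is a hypothetical helper, not part of nbstata or pystata.

```python
def leading_tabs_to_spaces(code: str, tab_width: int = 4) -> str:
    """Replace tabs in each line's leading whitespace with spaces.

    Tabs that appear later in a line (for example inside a string)
    are left untouched.
    """
    converted = []
    for line in code.splitlines(keepends=True):
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)]
        converted.append(indent.expandtabs(tab_width) + body)
    return "".join(converted)
```

Only the indentation is rewritten, so the Stata commands themselves are unchanged.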
Steps to reproduce:
sysuse auto
Running
%head
fails with
head failed.
must be real number, not str
However, it works when the variable rep78 (which has missing data) is omitted:
%head make price mpg headroom trunk weight length turn displacement gear_ratio foreign
Note: I am getting this error on a campus jupyterlab instance and believe the version of nbstata is fairly new as this was implemented last week, but can't verify at the moment.
I was trying to use nbstata interactively in RStudio, based on the example document from friosavila, but I can't get it to work as expected. First, the chunks never finish running, and if I click the Stop button, it returns an error (see screenshots below for more context).
The error occurs because the do command before the filename is missing. But one would expect the error to be returned more quickly.
. /tmp/Rtmpc5adhy/chunk-code-9d3d86c3c832d.txt
/ is not a valid command name
r(199);
The whole document is rendered fine though.
The document was created with jupyter engine and nbstata kernel.
I am also able to use the nbstata kernel in Jupyter notebooks, both interactively and to compile the whole document. I might revert to this solution for the time being, but it would be great to have it working in RStudio because I find its notebook implementation far superior to Jupyter's interface.
Please let me know if there is anything I could try or am doing wrong. I am happy to debug and then document it for future users.
[✓] Checking versions of quarto binary dependencies...
Pandoc version 3.1.1: OK
Dart Sass version 1.55.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
Version: 1.3.450
Path: /home/avila/opt/quarto-1.3.450/bin
[✓] Checking basic markdown render....OK
[✓] Checking Python 3 installation....OK
Version: 3.12.0
Path: /usr/bin/python3
Jupyter: 5.5.0
Kernels: python3, nbstata
() Checking Jupyter engine render....0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[✓] Checking Jupyter engine render....OK
[✓] Checking R installation...........OK
Version: 4.3.2
Path: /usr/lib64/R
LibPaths:
- /home/avila/R/x86_64-redhat-linux-gnu-library/4.3
- /usr/local/lib/R/library
- /usr/lib64/R/library
- /usr/share/R/library
knitr: 1.45
rmarkdown: 2.25
[✓] Checking Knitr engine render......OK
See the parallel issue at: ticoneva/pystata-kernel#25
Version 1.0 may just get rid of the --prefix install option. Meanwhile, I think the config file location should not be affected by the specified --prefix, which should instead only affect the kernel installation location.