kylebarron / stata_kernel

A Jupyter kernel for Stata. Works with Windows, macOS, and Linux.

Home Page: https://kylebarron.dev/stata_kernel/

License: GNU General Public License v3.0

Python 78.58% Stata 2.00% CSS 0.85% HTML 8.09% Shell 0.26% JavaScript 10.23%

stata jupyter jupyter-kernels jupyter-notebook

stata_kernel's Introduction

stata_kernel

stata_kernel is a Jupyter kernel for Stata. It works on Windows, macOS, and Linux.

To see an example Jupyter Notebook, click here.

For documentation and more information, see: https://kylebarron.dev/stata_kernel

stata_kernel's Issues

Jupyter Notebook doesn't work on Mac using Automation

Add pull request and issue templates

This file should have a class that accepts an unmodified input string and returns a list of runnable code blocks. In general, each item of the list should be a single line of Stata code, except when that line would not return to a dot prompt.

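class CodeManager(object):
    """Turn raw user input into chunks of Stata code that are safe to run.

    Skeleton only: the methods are stubs, so instantiating the class raises
    NotImplementedError until tokenize_code is written.
    """

    def __init__(self, code):
        self.input = code
        self.tokens = self.tokenize_code()

    def tokenize_code(self):
        # Do tokenizing with Pygments.
        # Return a list of tokens.
        raise NotImplementedError

    def is_complete(self):
        # Analyze tokens and determine if all of the input is complete.
        # Return a simple True/False.
        raise NotImplementedError

    def completions(self, cursor_pos):
        # Unclear whether it's necessary to tokenize the code for this.
        # Return a dict with the correct Jupyter message.
        raise NotImplementedError

    def tokens_to_lines(self):
        # Form logical code chunks from tokens.
        # This should also remove comments.
        # Return a list of code chunks that are safe to run in the console.
        raise NotImplementedError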

Support #delimit ;

Basically, have two sections of the lexer: one for the ; delimiter and one for cr.
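
As a rough illustration of the two delimiter modes, here is a minimal sketch in plain Python rather than a Pygments lexer. The function name split_commands is hypothetical, and the sketch assumes that semicolons never appear inside strings or comments, which a real lexer would have to handle.

```python
def split_commands(code):
    """Split Stata code into commands, switching modes on #delimit lines.

    Assumption: no semicolons inside strings or comments.
    """
    commands = []
    semicolon_mode = False
    pending = []  # pieces of a command still waiting for its ';'
    for line in code.splitlines():
        stripped = line.strip()
        if stripped.startswith('#delimit'):
            # "#delimit ;" enters semicolon mode; anything else returns to cr mode.
            argument = stripped[len('#delimit'):].strip()
            semicolon_mode = argument.startswith(';')
            continue
        if not semicolon_mode:
            # cr mode: every non-empty line is one command.
            if stripped:
                commands.append(stripped)
            continue
        # ; mode: a command may span lines, and a line may hold several commands.
        pending.append(stripped)
        *complete, rest = ' '.join(pending).split(';')
        commands.extend(c.strip() for c in complete if c.strip())
        pending = [rest] if rest.strip() else []
    return commands


example = '\n'.join([
    '#delimit ;',
    'regress y x1',
    '    x2 ;',
    'di "a" ; di "b" ;',
    '#delimit cr',
    'di "c"',
])
result = split_commands(example)
```

In this sketch, text that never receives its closing semicolon is silently dropped; a real implementation would more likely report the input as incomplete.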

When code sent to do_execute is not complete, return invalid response

The do_is_complete function prevents the Jupyter kernel from sending input until there's a valid, complete chunk of code. But when Atom sends code, there's no way to probe for more user input, so the selection goes directly to do_execute. Because of this, I can't assume that the code sent to do_execute is necessarily complete, and I need to check.
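
One way to picture the check is sketched below, under loud assumptions: the completeness test only counts braces and looks for a trailing /// continuation, and the error name IncompleteInput is made up for illustration. The reply dictionaries follow the general shape of Jupyter execute replies, but this is not the kernel's actual code.

```python
def is_complete(code):
    """Very rough completeness test (assumption: braces and /// are enough)."""
    lines = [line.strip() for line in code.splitlines() if line.strip()]
    if not lines:
        return True
    if lines[-1].endswith('///'):
        # A trailing line continuation means more input is expected.
        return False
    # Naive brace balance; ignores braces inside strings and comments.
    depth = sum(line.count('{') - line.count('}') for line in lines)
    return depth <= 0


def execute_reply(code, execution_count):
    """Build the dict a do_execute method could return for this input."""
    if not is_complete(code):
        return {
            'status': 'error',
            'ename': 'IncompleteInput',  # hypothetical error name
            'evalue': 'Code sent for execution is not a complete block',
            'traceback': [],
            'execution_count': execution_count,
        }
    # Here the real kernel would send the code to Stata.
    return {
        'status': 'ok',
        'execution_count': execution_count,
        'payload': [],
        'user_expressions': {},
    }
```

Returning an error reply keeps incomplete selections from Atom away from the Stata console, where they would otherwise leave it waiting for more input.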

Only depend on pexpect on Mac/Linux

https://stackoverflow.com/questions/6469508/is-it-possible-to-express-a-platform-specific-dependency-in-setup-py-without-bui

https://stackoverflow.com/questions/16055403/setuptools-platform-specific-dependencies/32955538#32955538
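
A minimal sketch of what the linked answers suggest, using a PEP 508 environment marker so that pexpect is only installed outside Windows. The package names are taken from the Requires list in the package metadata; the variable name INSTALL_REQUIRES is illustrative, and the list would be passed to setuptools.setup(install_requires=...).

```python
# Requirements as they could be passed to setuptools.setup(install_requires=...).
INSTALL_REQUIRES = [
    'jupyter-client',
    'IPython',
    'ipykernel',
    'python-dateutil',
    # Environment marker: installers skip pexpect when the platform is Windows.
    'pexpect; sys_platform != "win32"',
]
```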

Use pygments to decide if entered code is complete

Code introspection

Allow magics

Can I import some from IPython directly? Like %time?

Support completions

Try to automatically find Stata executable

Add line continuation indent to loop lines

Have to add the line-continuation spaces after the ". " prompt.

Don't hold program for program drop, only for program define

Tokens issue

@mcaceresb

The specific issue here is that strings are tokenized differently than text. With my current code joining method, I was thinking there would be two different Token types, Token.Text and Token.MatchingBracket.Other, where the latter is a contiguous set of tokens that comprise a block (and which thus must be sent all at once).

Obviously I forgot that I had also defined the token type Token.Literal.String. Since this is a different name than the earlier two token types, the loop parsed it into a different line.

I need to leave the string token type so that it catches comments, and also within blocks:

This is where it's superbly annoying that Pygments has no way to show the entire token stack. It only shows the topmost token, so I can't see (I don't think) that the Token.Literal.String.Double above is inside a Token.MatchingBracket.Other.

I think the main way to solve this is to set the token name for strings that occur within a block to Token.MatchingBracket.Other

Add yapf/flake8/pytest to dev dependencies

Fix ANSI terminal issues with pexpect

With the current expect regex, some commands, notably shell, fail. This is because some escape codes are caught in the output, so there isn't a \r\n\r\n ..

With the expect regex being \r\n ., I see:

Allow KeyboardInterrupt

See: https://github.com/jrfiedler/stata-kernel/blob/28e36743db5212875f5315331d34f7722be24407/stata_kernel/stata_kernel.py#L90-L96

Delete .stata_kernel_images folder on kernel shutdown

Documentation Website

Mkdocs 1.0!

See:
https://aqueduct.io/docs/

Provide a way for console users to get back in sync

Add a note to readme about di _n(2) ". " messing it up. Think if there's a way to get pexpect to sync up. Maybe keep looking for \r\n\. with a .5 sec timeout and return when you get one or two timeouts in a row.
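
The timeout idea could look roughly like the sketch below. To keep it self-contained, the prompt check is passed in as a callable named saw_prompt (a hypothetical name); the commented lines show how it might wrap pexpect, which is untested here.

```python
def resync(saw_prompt, timeout=0.5, max_timeouts=2):
    """Consume prompts until `max_timeouts` checks in a row time out.

    saw_prompt(timeout) must return True if a prompt arrived within
    `timeout` seconds and False otherwise. Returns the number of prompts
    consumed while catching up.
    """
    prompts_seen = 0
    timeouts_in_a_row = 0
    while timeouts_in_a_row < max_timeouts:
        if saw_prompt(timeout):
            prompts_seen += 1
            timeouts_in_a_row = 0
        else:
            timeouts_in_a_row += 1
    return prompts_seen


# With pexpect, saw_prompt could be written like this (assumption, not run here):
#
#     def saw_prompt(timeout):
#         try:
#             child.expect(r'\r\n\. ', timeout=timeout)
#             return True
#         except pexpect.TIMEOUT:
#             return False
```

Requiring two timeouts in a row trades an extra second of waiting for less risk of returning while Stata is still printing output.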

Fix exit behavior

When I type exit, pexpect seems to wait for a response that will never come.

Fix line continuation for `program`

From program or program define until end, there should be line continuations

Text wrapping with long input/output using shell

Various ways the kernel is tripped because of changing prompts

Code Sample

disp "Sup?", _request(what_is_up)

pause on
program foo
    pause
    disp "bar"
end
foo

`local'

$global

Problem description

Each of the 4 examples above messes up the kernel because the prompt changes.

Package Version

Name: stata-kernel
Version: 1.1.0
Summary: A Jupyter kernel for Stata. Works with Windows, macOS, and Linux. Preserves program state.
Home-page: https://github.com/kylebarron/stata_kernel
Author: Kyle Barron
Author-email: [email protected]
License: GPLv3
Location: /usr/lib/python3.6/site-packages
Requires: jupyter-client, IPython, ipykernel, python-dateutil, pexpect
Required-by:

Images

Helpful example from here
https://ipython-books.github.io/16-creating-a-simple-kernel-for-jupyter/

Allow displaying local by sending just local

If I have a local in my environment named x and set to 5, and I send

to Stata,

If the selection sent is a single identifier
Check if the selection matches a program, first with which program then by matching against program dir.
Check if selection is name of a local or global. Might be able to do this without pinging Stata again by using environment state created for completions.
If a local or global, run di `local'. Might also do this for r-class or e-class?

Note: Add check_env flag to run function so that I only check environment when needed.

Make sure this is an option in the configuration file
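
The lookup steps above can be sketched as a pure function. This is a sketch under assumptions: the sets of program, local, and global names are taken to come from the environment state already collected for completions, and the function name command_for_selection is hypothetical.

```python
import re

# A single Stata-style identifier: letters, digits, underscores, no leading digit.
IDENTIFIER = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')


def command_for_selection(selection, programs, locals_, globals_):
    """Return the code to send to Stata for a given selection."""
    name = selection.strip()
    if not IDENTIFIER.match(name):
        # Not a lone identifier: run the selection unchanged.
        return selection
    if name in programs:
        # Programs win, so a program named like a macro still runs.
        return selection
    if name in locals_:
        return "di `{}'".format(name)
    if name in globals_:
        return 'di ${}'.format(name)
    return selection
```

A macro holding a string would need compound quotes around the expansion for display to print it; the sketch leaves that out.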

Windows support

First of all, awesome work! I will try this kernel out on my Linux workstation soon and let you know if I run into any issues.

I have been doing a little bit of research into Windows support but it seems that the Windows version does not have an interactive console mode (only the full GUI version or the console batch mode). It does appear that it is possible to install Stata without GUI, but I have not been able to get this to work yet (e.g. https://www.stata.com/support/faqs/windows/install-from-command-line/).

Another option would be to include Stata Automation integration for Windows users (like I did with ipystata). I can probably add this for you once I get back from holiday in a couple of weeks.

I will also definitely consider adding similar Linux (and macOS) functionality to ipystata. By the looks of it, this approach using pexpect works nicely.

Add execution mode to banner

Use single cache directory for logs, images

Code Sample, a copy-pastable example if possible

# Your code here

Problem description

[this should explain why the current behaviour is a problem and why the expected output is a better solution.]

Note: Many problems can be resolved by simply upgrading medicare_utils to the latest version. Before submitting, please check if that solution works for you.

Expected Output

Package Version

Create cross-session history file

https://jupyter-client.readthedocs.io/en/stable/wrapperkernels.html#MyKernel.do_history
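
A minimal sketch of a file-backed history that a do_history method could delegate to. The JSON-lines file format and both function names are assumptions; the reply shape (a status plus a list of session, line number, input tuples) follows the Jupyter history reply.

```python
import json
import os


def append_history(path, session, line_number, code):
    """Append one executed input to the history file as a JSON line."""
    with open(path, 'a', encoding='utf-8') as f:
        f.write(json.dumps([session, line_number, code]) + '\n')


def history_reply(path, n=None):
    """Build a history reply from the file, optionally only the last n entries."""
    entries = []
    if os.path.exists(path):
        with open(path, encoding='utf-8') as f:
            entries = [tuple(json.loads(line)) for line in f if line.strip()]
    if n is not None and n > 0:
        entries = entries[-n:]
    return {'status': 'ok', 'history': entries}
```

Because entries are appended one line at a time, several kernel sessions can share the same file without rewriting it.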

Update github issue templates

Change python code to stata

Stata 14 support

I suspect that the nomsg parameter was added in the most recent version of Stata (i.e. 15).

When trying to execute code using Stata 14 I get an error:

stata_kernel/stata_kernel/kernel.py

Line 297 in 2d408bc

code = 'log using `"{}"\', replace text nomsg{}{}'.format(

Fix svg path for Automation

At least on Windows; test on Mac.

Failing examples for removing comments

Add pygments as a dependency

Notes

See #22.
If I use just \r\n\. as the regex, if I run di ". text", pexpect gets confused because it's looking for the next linesep + dot + space as the prompt, but the first one it finds is the result, and not the next line prompt. I believe all prompts have a full empty line before them, so I made \r\n\r\n\. the regex to find the dot prompt, but this failed on Linux on my machine whenever it did a shell command. !ls would have the \x1b ANSI escape code returned from the shell, and wouldn't match the double linesep. The lookbehind instead of just matching them leaves an extra newline between results. Otherwise the next row of [2]: would be right up against [1]:
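
The difference between the two patterns can be checked with Python's re module alone. In this sketch the buffers are hand-written approximations of what the console returns, not captured output, and the \x1b= sequence in the shell example is assumed from the regex used elsewhere in these issues.

```python
import re

# Plain pattern: any line that starts with ". " looks like a prompt.
NAIVE_PROMPT = re.compile(r'\r\n\. ')
# Lookbehind pattern: the prompt must follow a blank line or the \x1b= escape.
PROMPT = re.compile(r'(?<=\r\n|\x1b=)\r\n\. ')

# Approximate buffer after running: di ". text"
display_output = 'di ". text"\r\n. text\r\n\r\n. '
# Approximate buffer after a shell command whose output ends with \x1b=
shell_output = '!ls\r\n\r\nfile.do\r\n\x1b=\r\n. '

naive_hit = NAIVE_PROMPT.search(display_output)
prompt_hit = PROMPT.search(display_output)
shell_hit = PROMPT.search(shell_output)

# Text captured before each match, which is what pexpect would treat as output.
naive_before = display_output[:naive_hit.start()]
prompt_before = display_output[:prompt_hit.start()]
```

The plain pattern stops right after the echoed command, before the displayed ". text" line, while the lookbehind pattern only matches the real prompt at the end of the buffer.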

No, exit doesn't work. This is because pexpect is waiting for a response from the spawned Stata console, but that has already closed. I need to figure out how to fix this. See #7

I plan to refactor the code a good amount. I'd say the code is relatively clean for having written it quickly, but I want to step back a minute and try to think through an optimal structure for the package. I'm thinking of moving all the code that deals with validating input into a code_manager.py file (see #32 for initial thoughts). Hopefully that will parse input and not send code to Stata unless it's sure that it can be run safely (once I turn off the timeout option, things will run forever if invalid input is sent to the console). Ideally this would also use tokens for autocompletions, though I'll probably put that into a separate file.

Then a separate run.py file for sending the validated code to Stata and retrieving output. I'm happy to have devised a function that allows me to abstract the differences between sending code with Windows Automation vs Mac AppleScript. But there are still differences on each platform in retrieving output. There are some issues with retrieving graphs especially, and apparently Stata 14 doesn't export SVG?

Error in shell with program define

Add stoponerror option

Support keyboard interrupts

Is your feature request related to a problem? Please describe.
Keyboard interrupts (Ctrl+C) should be supported whenever possible. In the console version of Stata this is akin to hitting the "break" button. This is generally useful but would also help with a common mistake (that at least I make): Sometimes I put a command that would have Stata printing stuff to the console for several minutes, and Ctrl+C is very helpful to break that.

Describe the solution you'd like
The following works for me in run_shell but I haven't tested it too thoroughly.

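# Dot prompt preceded by a blank line or by the \x1b= escape sequence.
# Raw string so that the regex escapes reach the re module unchanged.
PROMPT = r'(?<=(\r\n)|(\x1b=))\r\n\. '

for line in lines:
    try:
        self.child.sendline(line)
        self.child.expect(PROMPT, timeout=20)
    except KeyboardInterrupt:
        # Forward Ctrl+C (ETX, \003) to Stata, then wait for the prompt to return.
        self.child.send('\003')
        self.child.expect(PROMPT, timeout=20)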

Describe alternatives you've considered
N/A

Additional context
N/A

I linked it to the %%stata magic, but I am sure you can also apply it to an entire kernel.

See these lines of code (link):

if config.enable_syntax_highlight:
	# Enable the stata syntax highlighting:
	#js = "IPython.CodeCell.config_defaults.highlight_modes['magic_stata'] = {'reg':[/^%%stata/]};"
	js = """require(['notebook/js/codecell'], function(codecell) {
			  codecell.CodeCell.options_default.highlight_modes['magic_stata'] = {'reg':[/^%%stata/]} ;
			  Jupyter.notebook.events.one('kernel_ready.Kernel', function(){
			      Jupyter.notebook.get_cells().map(function(cell){
			          if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;
			  });
			});"""
	display.display_javascript(js, raw=True)

Check out pexpect's repl class

https://pexpect.readthedocs.io/en/latest/api/replwrap.html

Make configuration editable inline with magics

In IPython, there's

%config InlineBackend.figure_format = 'svg'

Make something similar for Stata. I.e.

%config figure_format = 'svg'

This should have the same options that are available in the global configuration file. The install script should also set platform-specific defaults.
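
A minimal sketch of how such a magic line could be parsed. The function name apply_config_magic and the use of a plain dict for the configuration are assumptions; the only option shown is figure_format from the example above.

```python
import re

# Matches lines such as: %config figure_format = 'svg'
CONFIG_MAGIC = re.compile(r'^%config\s+(\w+)\s*=\s*(.+)$')


def apply_config_magic(line, config):
    """Update `config` in place from a %config line and return it."""
    match = CONFIG_MAGIC.match(line.strip())
    if match is None:
        raise ValueError('Expected: %config option = value')
    key, raw_value = match.groups()
    if key not in config:
        # Only options that exist in the global configuration may be set.
        raise KeyError('Unknown configuration option: {}'.format(key))
    # Drop surrounding quotes so 'svg', "svg" and svg are equivalent.
    config[key] = raw_value.strip().strip('\'"')
    return config


settings = {'figure_format': 'png'}
apply_config_magic("%config figure_format = 'svg'", settings)
```

Rejecting unknown keys keeps the inline magic aligned with the options in the global configuration file.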
