stata_kernel's Introduction

stata_kernel

stata_kernel is a Jupyter kernel for Stata. It works on Windows, macOS, and Linux.

To see an example Jupyter Notebook, click here.

For documentation and more information, see: https://kylebarron.dev/stata_kernel

Jupyter Notebook

Atom

stata_kernel's People

Contributors

alanzchen, carreau, dependabot[bot], gaulinmp, kylebarron, mcaceresb, mcg1969, pacbard, ryan-wallace-frbb, sikaiser, stanmart, takluyver, tkw1536, willingc

stata_kernel's Issues

Add pygments as a dependency

Stata syntax highlighting

It might be worth mentioning that in the meantime (since our last chat about it at the end of 2017) I have managed to get my (admittedly hacky) Stata syntax highlighting working again in the Jupyter Notebook.

The codemirror file is here: https://github.com/TiesdeKok/ipystata/blob/49acff7e13716d7baa7cf43be1b6b7c6cef9fc0c/ipystata/stata.js

I linked it to the %%stata magic, but I am sure you can also just apply it to an entire kernel.

See these lines of code (link):

if config.enable_syntax_highlight:
	# Enable the stata syntax highlighting:
	#js = "IPython.CodeCell.config_defaults.highlight_modes['magic_stata'] = {'reg':[/^%%stata/]};"
	js = """require(['notebook/js/codecell'], function(codecell) {
			  codecell.CodeCell.options_default.highlight_modes['magic_stata'] = {'reg':[/^%%stata/]} ;
			  Jupyter.notebook.events.one('kernel_ready.Kernel', function(){
			      Jupyter.notebook.get_cells().map(function(cell){
			          if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;
			  });
			});"""
	display.display_javascript(js, raw=True)

When code sent to do_execute is not complete, return invalid response

The do_is_complete function prevents the Jupyter kernel from sending input until there's a valid, complete chunk of code. But when Atom sends code, there's no way to probe for more user input, so the selection goes directly to do_execute. Because of this, I can't assume that the code sent to do_execute is necessarily complete, and I need to check.
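A minimal sketch of the kind of check do_execute could run before sending code to Stata. The function name looks_complete and its two heuristics (a trailing /// continuation and an unclosed brace) are assumptions for illustration, not the kernel's actual logic. It deliberately ignores braces that appear inside strings or comments.

```python
def looks_complete(code: str) -> bool:
    """Heuristic check that a chunk of Stata code is not waiting for more input."""
    lines = code.rstrip().splitlines()
    if not lines:
        return True
    # A trailing `///` tells Stata the command continues on the next line.
    if lines[-1].rstrip().endswith("///"):
        return False
    # Incomplete if some `{` block (foreach, forvalues, if, ...) is still open.
    return code.count("{") <= code.count("}")


if __name__ == "__main__":
    print(looks_complete('di "hello"'))
    print(looks_complete("foreach i in 1 2 {\n    di `i'"))
```

If a check like this returns False inside do_execute, the kernel could return an error reply instead of forwarding the selection to the console.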

Allow displaying local by sending just local

If I have a local in my environment named x and set to 5, and I send

x

to Stata, the kernel should display its value. Proposed logic:

  1. If the selection sent is a single identifier
  2. Check if the selection matches a program, first with which program then by matching against program dir.
  3. Check if selection is name of a local or global. Might be able to do this without pinging Stata again by using environment state created for completions.
  4. If a local or global, run di `local'. Might also do this for r-class or e-class?

Note: Add check_env flag to run function so that I only check environment when needed.

Make sure this is an option in the configuration file
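The steps above can be sketched roughly as follows. The function display_command and its arguments programs, local_names, and global_names are hypothetical; they stand in for the environment state that the completions code would already have collected. The global form `di $name` is also an assumption, since the notes only spell out the local case.

```python
import re

# A single bare identifier (letters, digits, underscores; not starting with a digit).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def display_command(selection, programs, local_names, global_names):
    """Return the Stata command to run for `selection`."""
    name = selection.strip()
    # Step 1: anything other than a single identifier is sent unchanged.
    if not IDENTIFIER.fullmatch(name):
        return selection
    # Step 2: program names take precedence and run as usual.
    if name in programs:
        return selection
    # Steps 3-4: display the macro instead of running it as a command.
    if name in local_names:
        return "di `" + name + "'"
    if name in global_names:
        return "di $" + name
    return selection


if __name__ == "__main__":
    print(display_command("x", set(), {"x"}, set()))
```

Passing the name sets in as arguments keeps the lookup free of extra round trips to Stata, which matches the idea of reusing state gathered for completions.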

Notes

See #22.
If I use just \r\n\. as the regex and run di ". text", pexpect gets confused: it looks for the next linesep + dot + space as the prompt, but the first match it finds is in the result, not the next prompt. I believe all prompts have a full empty line before them, so I made \r\n\r\n\. the regex to find the dot prompt, but on my Linux machine this failed whenever I ran a shell command: !ls returned the \x1b ANSI escape code from the shell, so the output didn't match the double linesep. Using a lookbehind for the first linesep, instead of matching it, leaves an extra newline between results. Otherwise the next row of [2]: would be right up against [1]:
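To make the three regex variants concrete, here is a small sketch using Python's re module on a simulated console buffer. The exact bytes in output are an assumption about what the Stata console returns (echoed command, result line, empty line, dot prompt), not captured output.

```python
import re

# Assumed console output after sending `di ". text"`: the echoed command,
# the result line, an empty line, then the dot prompt.
output = 'di ". text"\r\n. text\r\n\r\n. '

single = re.search(r"\r\n\. ", output)            # linesep + dot + space
double = re.search(r"\r\n\r\n\. ", output)        # requires the empty line
lookbehind = re.search(r"(?<=\r\n)\r\n\. ", output)  # empty line checked, not consumed

# Text that would be treated as "output before the prompt" in each case.
print(repr(output[:single.start()]))
print(repr(output[:double.start()]))
print(repr(output[:lookbehind.start()]))
```

The single-linesep pattern stops at the result line, while the lookbehind variant stops at the real prompt and leaves one trailing linesep in the captured text, which is the extra newline between results described above.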

No, exit doesn't work. This is because pexpect is waiting for a response from the spawned Stata console, but that has already closed. I need to figure out how to fix this. See #7

I plan to refactor the code a good amount. I'd say the code is relatively clean for having written it quickly, but I want to step back a minute and try to think through an optimal structure for the package. I'm thinking of moving all the code that deals with validating input into a code_manager.py file (see #32 for initial thoughts). Hopefully that will parse input and not send code to Stata unless it's sure that it can be run safely (once I turn off the timeout option, things will run forever if invalid input is sent to the console). Ideally this would also use tokens for autocompletions, though I'll probably put that into a separate file.

Then a separate run.py file for sending the validated code to Stata and retrieving output. I'm happy to have devised a function that lets me abstract the differences between sending code with Windows Automation and with macOS AppleScript. But there are still differences on each platform in retrieving output. There are some issues with retrieving graphs especially, and apparently Stata 14 doesn't export SVG?

Make configuration editable inline with magics

In IPython, there's

%config InlineBackend.figure_format = 'svg'

Make something similar for Stata. I.e.

%config figure_format = 'svg'

This should have the same options that are available in the global configuration file. Platform-specific defaults should also be set during the install script.
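One possible shape for such a magic, as a rough sketch. The function apply_config_magic, the settings dictionary, and the 'png' starting value are all hypothetical; the only part taken from the proposal is the `%config key = value` syntax.

```python
import ast
import re

# Matches lines such as: %config figure_format = 'svg'
CONFIG_MAGIC = re.compile(r"^%config\s+(\w+)\s*=\s*(.+)$")


def apply_config_magic(line, settings):
    """Update `settings` in place from one %config line and return it."""
    match = CONFIG_MAGIC.match(line.strip())
    if match is None:
        raise ValueError(f"not a %config assignment: {line!r}")
    key, raw_value = match.groups()
    # Only accept options that already exist in the configuration.
    if key not in settings:
        raise KeyError(f"unknown option: {key}")
    # literal_eval parses quoted strings and numbers without running code.
    settings[key] = ast.literal_eval(raw_value)
    return settings


if __name__ == "__main__":
    settings = {"figure_format": "png"}  # hypothetical starting value
    apply_config_magic("%config figure_format = 'svg'", settings)
    print(settings)
```

Rejecting unknown keys keeps the inline magic limited to the options that the global configuration file supports.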

Various ways the kernel is tripped because of changing prompts

Code Sample

disp "Sup?", _request(what_is_up)

pause on
program foo
    pause
    disp "bar"
end
foo

`local'

$global

Problem description

Each of the 4 examples above messes up the kernel because the prompt changes.

Package Version

Name: stata-kernel
Version: 1.1.0
Summary: A Jupyter kernel for Stata. Works with Windows, macOS, and Linux. Preserves program state.
Home-page: https://github.com/kylebarron/stata_kernel
Author: Kyle Barron
Author-email: [email protected]
License: GPLv3
Location: /usr/lib/python3.6/site-packages
Requires: jupyter-client, IPython, ipykernel, python-dateutil, pexpect
Required-by: 

Allow magics

Can I import some from IPython directly? Like %time?

Tokens issue

@mcaceresb

The specific issue here is that strings are tokenized differently than text. With my current code joining method, I was thinking there would be two different Token types, Token.Text and Token.MatchingBracket.Other, where the latter is a contiguous set of tokens that comprise a block (and which thus must be sent all at once).

Obviously I forgot that I had also defined the token type Token.Literal.String. Since this is a different name than the earlier two token types, the loop parsed it into a different line.

I need to leave the string token type so that it catches comments, and also within blocks:
image

This is the part where it's superbly annoying that Pygments has no way to show the entire token stack. It only shows the topmost token, so I can't see (I don't think) that the Token.Literal.String.Double above is inside a Token.MatchingBracket.Other.

I think the main way to solve this is to set the token name for strings that occur within a block to Token.MatchingBracket.Other.
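As an illustration of the idea, the sketch below relabels string tokens inside a brace block and then joins neighbouring tokens of the same type. It is a post-processing pass over (token type, text) pairs, not a change to the lexer itself, and it uses plain strings for the token names so that it runs without Pygments. The function relabel_and_merge and the brace-counting rule are assumptions for illustration.

```python
TEXT = "Token.Text"
BLOCK = "Token.MatchingBracket.Other"
STRING = "Token.Literal.String"


def relabel_and_merge(tokens):
    """Relabel strings inside `{ ... }` as block tokens, then merge neighbours."""
    depth = 0
    merged = []
    for token_type, text in tokens:
        if token_type.startswith(STRING):
            # A string seen while a brace is open belongs to the block.
            if depth > 0:
                token_type = BLOCK
        else:
            # Track open braces using non-string tokens only.
            depth += text.count("{") - text.count("}")
        if merged and merged[-1][0] == token_type:
            merged[-1] = (token_type, merged[-1][1] + text)
        else:
            merged.append((token_type, text))
    return merged


if __name__ == "__main__":
    tokens = [
        (TEXT, "foreach i in 1 2 "),
        (BLOCK, "{\n    di "),
        (STRING + ".Double", '"hi"'),
        (BLOCK, "\n}"),
    ]
    for token in relabel_and_merge(tokens):
        print(token)
```

With the string relabeled, the whole block ends up in one contiguous token and can be sent to Stata all at once.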

Provide a way for console users to get back in sync

Add a note to readme about di _n(2) ". " messing it up. Think if there's a way to get pexpect to sync up. Maybe keep looking for \r\n\. with a .5 sec timeout and return when you get one or two timeouts in a row.
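The resync loop described above could look roughly like this. The name resync and the wait_for_prompt callable are hypothetical: the callable is assumed to return True when a prompt was seen within the timeout and False when the wait timed out.

```python
def resync(wait_for_prompt, max_timeouts=2):
    """Consume prompts until `max_timeouts` waits in a row time out.

    Returns the number of prompts that were consumed along the way.
    """
    consecutive_timeouts = 0
    prompts_seen = 0
    while consecutive_timeouts < max_timeouts:
        if wait_for_prompt():
            prompts_seen += 1
            consecutive_timeouts = 0  # output is still arriving, start over
        else:
            consecutive_timeouts += 1
    return prompts_seen


if __name__ == "__main__":
    # Simulated session: two stray prompts, a pause, one more prompt, then quiet.
    results = iter([True, True, False, True, False, False])
    print(resync(lambda: next(results)))
```

With pexpect, wait_for_prompt could wrap a call to child.expect with timeout=0.5 and return False when pexpect.TIMEOUT is raised.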

Use single cache directory for logs, images

Consider "flushing dots" before a DoCommandAsync

A current issue is that log files from DoCommandAsync might have different starting positions based on whether the previous command was a DoCommand or a DoCommandAsync. Consider always running a DoCommandAsync with di "flush log" first.

Support keyboard interrupts

Is your feature request related to a problem? Please describe.
Keyboard interrupts (Ctrl+C) should be supported whenever possible. In the console version of Stata this is akin to hitting the "break" button. This is generally useful but would also help with a common mistake (that at least I make): Sometimes I put a command that would have Stata printing stuff to the console for several minutes, and Ctrl+C is very helpful to break that.

Describe the solution you'd like
The following works for me in run_shell but I haven't tested it too thoroughly.

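# Dot-prompt regex; a raw string so the escapes reach the regex engine unchanged
prompt = r'(?<=(\r\n)|(\x1b=))\r\n\. '
for line in lines:
    try:
        self.child.sendline(line)
        self.child.expect(prompt, timeout=20)
    except KeyboardInterrupt:
        # Send Ctrl+C (ETX) to break the running command, then wait for the prompt
        self.child.send('\003')
        self.child.expect(prompt, timeout=20)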

Describe alternatives you've considered
N/A

Additional context
N/A

Fix ANSI terminal issues with pexpect

With the current expect regex, some commands, notably shell, fail. This is because some escape codes are caught in the output, so there isn't a \r\n\r\n . to match.

With the expect regex being \r\n ., I see:
image

Fix exit behavior

When I type exit, it seems pexpect is waiting for a response when there will never be one.
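One possible direction, sketched under assumptions: pexpect's expect accepts a list of patterns and returns the index of the one that matched, and pexpect.EOF can be one of the entries, so the wait can end when the console closes. The helper send_line is hypothetical, and it takes the EOF marker as an argument so that the sketch runs without pexpect installed.

```python
def send_line(child, line, prompt, eof, timeout=20):
    """Send one line; return False if the process exited instead of prompting."""
    child.sendline(line)
    # expect() with a list returns the index of the pattern that matched.
    matched = child.expect([prompt, eof], timeout=timeout)
    return matched == 0


if __name__ == "__main__":
    class FakeChild:
        """Stand-in for a spawned console that always reports end-of-file."""

        def sendline(self, line):
            pass

        def expect(self, patterns, timeout=None):
            return 1  # index of the EOF entry

    print(send_line(FakeChild(), "exit", r"\r\n\. ", object()))
```

In the kernel, eof would be pexpect.EOF, and a False return value would be the signal to shut the kernel down cleanly.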

code_manager.py file

This file should have a class that accepts an unmodified input string and returns a list of run-able code blocks. In general each item of the list should be a single line of Stata code, except when that line of Stata code would not return to a dot prompt.

class CodeManager(object):
    def __init__(self, code):
        self.input = code
        self.tokens = self.tokenize_code()

    def tokenize_code(self):
        # Do tokenizing with Pygments
        # Return a list of tokens
        pass  # stub, not yet implemented

    def is_complete(self):
        # Analyze tokens
        # Determine if all of input is complete.
        # Return simple True/False
        pass  # stub, not yet implemented

    def completions(self, cursor_pos):
        # Unclear whether it's necessary to tokenize the code for this.
        # Return dict with correct Jupyter message
        pass  # stub, not yet implemented

    def tokens_to_lines(self):
        # Form logical code chunks from tokens
        # This should also remove comments
        # Returns a list of code chunks that are safe to run in the console
        pass  # stub, not yet implemented

Windows support

First of all, awesome work! I will try this kernel out on my Linux workstation soon and let you know if I run into any issues.

I have been doing a little bit of research into Windows support but it seems that the Windows version does not have an interactive console mode (only the full GUI version or the console batch mode). It does appear that it is possible to install Stata without GUI, but I have not been able to get this to work yet (e.g. https://www.stata.com/support/faqs/windows/install-from-command-line/).

Another option would be to include Stata Automation integration for Windows users (like I did with ipystata). I can probably add this for you once I get back from holiday in a couple of weeks.

I will also definitely consider adding similar Linux (and macOS) functionality to ipystata; by the looks of it, this approach using pexpect seems to work nicely.
