Polyglottus

Code blocks and markup interacting in a single script in a consistent format for literate programming.

Overview

Polyglottus will be a format for writing a script that contains blocks in different languages, including markup languages, that will run in multiple kernels which can import data from each other. A script of this type should be able to be imported into a notebook (or run as a notebook given the proper infrastructure), or simply run at the command line to output a notebook-like document.

Reason

I love using Jupyter notebooks and use them quite a lot. I have taught workshops with them. But I would also like to have something similar that is a little more programmatic. A notebook is less a program or a document than a serialized client "session." I also very much like R Markdown for its programmatic and textual simplicity. Then there is R-Spin, and more recently nteract.

All of these are attempts at literate programming that emphasize code and documentation in different ways. Some are more flexible than others, and some are more rational than others. For instance, although Hydrogen (nteract for the Atom editor) seems close to what I want to do, it is still "markup first" and does not have a robust syntax for defining cells.

This is my own attempt. My idea is that only things in cells (blocks, chunks, ...) are executed, processed, or rendered. Anything outside of a cell is a comment and is not rendered. Thus code and documentation are on completely equal footing, and all blocks (or cells) are delineated in the same way. Importing and exporting variables between cells, even with different kernels, is also done in a consistent manner (this is a feature that Jupyter lacks).

Formats

I played around with a couple of different ideas for how to delineate code blocks. I've settled on a syntax that combines the magics syntax from IPython/Jupyter and the Matlab cell syntax. To start a cell, you begin a line with %% followed by a processor name. This is the name of the kernel or other method used to parse, execute, and/or render what is in the block. Including a tilde "~" directly after the processor name tells the parser that a section of metadata will follow (I'm not married to the tilde; it was just my first idea). The metadata is in YAML format and specifies such things as variables imported from other kernels, stylesheets, whether or not to display code and/or output, etc. A cell is closed with /%%.

Example:

This is ignored because it is not in a cell

%%kerneldefs
r1 = kernel(language="R", kernel_name="ir")
p1 = kernel(language="Python", kernel_name="IPython")
m1 = kernel(language="Markdown", renderer="markdown.js")
/%%

%%r1
a <- 10
/%%

%%m1~
------
imports:
  - r1:
      data: a
stylesheet: sleek
------
This is some markdown. In the kernel _r1_ the variable
`a` has the value {{a}}.
/%%

%%p1~
------
imports:
  - r1:
      maximum: a
------
j = 1
for i in range(1, maximum + 1):
  j *= i
  print(i)
/%%

%%r1
cat(a)
/%%

%%p1
print("maximum! = " + str(j))
/%%
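
A minimal sketch of how a file in this format might be split into cells. This is not the project's parser, just an illustration under a few assumptions: the names `Cell` and `parse_poly` are hypothetical, the metadata section is taken to be fenced by lines of dashes as in the example above, and the YAML is kept as raw text rather than parsed.

```python
import re
from dataclasses import dataclass

# Opening line: "%%" + processor name + optional "~" announcing metadata.
CELL_OPEN = re.compile(r"^%%(\w+)(~?)\s*$")
CELL_CLOSE = "/%%"
# Assumption: metadata is fenced by lines of three or more dashes.
META_FENCE = re.compile(r"^-{3,}\s*$")


@dataclass
class Cell:
    processor: str  # kernel or renderer name, e.g. "r1"
    metadata: str   # raw YAML text, left unparsed in this sketch
    body: str       # code or markup to execute or render


def parse_poly(text):
    """Return the cells found in `text`; lines outside cells are skipped."""
    cells = []
    lines = iter(text.splitlines())
    for line in lines:
        match = CELL_OPEN.match(line)
        if match is None:
            continue  # outside any cell: treated as a comment
        processor, has_meta = match.group(1), bool(match.group(2))
        block = []
        for inner in lines:  # consume lines up to the closing "/%%"
            if inner.strip() == CELL_CLOSE:
                break
            block.append(inner)
        else:
            raise ValueError(f"cell %%{processor} is never closed")
        metadata = []
        if has_meta and block and META_FENCE.match(block[0]):
            # Split off everything between the first two dash fences.
            for end in range(1, len(block)):
                if META_FENCE.match(block[end]):
                    metadata, block = block[1:end], block[end + 1:]
                    break
        cells.append(Cell(processor, "\n".join(metadata), "\n".join(block)))
    return cells


SAMPLE = """\
This line is ignored because it is not in a cell

%%r1
a <- 10
/%%

%%m1~
------
imports:
  - r1:
      data: a
------
The value of `a` is {{a}}.
/%%
"""

cells = parse_poly(SAMPLE)
for cell in cells:
    print(cell.processor, "|", repr(cell.metadata), "|", repr(cell.body))
```

The raw metadata string could then be handed to any YAML parser, and the cells dispatched to their processors in order.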

Importing Data

You will be able to import data from one kernel into another for a limited number of data types. For starters, these will be an extended set of the usual primitive types:

int, float, string, array, dict, table

Import equivalencies can be defined:

data.table <==> DataFrame (Python), Object (JS), TSV (bash)
string <==> string
int <==> int
Array <==> List, Array, etc.
Dict <==> JS object (JS), nested named lists (R), YAML, XML
etc.
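
One way such equivalencies might be applied is to turn a value exported from one kernel into source code that recreates it in another. The sketch below is only an illustration: the helper names `to_source` and `r_literal` are hypothetical, only int, float, string, array, and dict are covered, and it assumes the value has already been brought into Python (for example as JSON).

```python
import json


def r_literal(value):
    """Render a Python value as an R expression (nested named lists for dicts)."""
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return f"{value}L"  # assumes the value fits R's 32-bit integers
    if isinstance(value, float):
        return repr(value)  # assumes a finite number (no inf or nan)
    if isinstance(value, str):
        return json.dumps(value)  # assumes JSON escapes are acceptable to R
    if isinstance(value, list):
        return "list(" + ", ".join(r_literal(v) for v in value) + ")"
    if isinstance(value, dict):
        items = ", ".join(f"`{k}` = {r_literal(v)}" for k, v in value.items())
        return f"list({items})"
    raise TypeError(f"no equivalence defined for {type(value).__name__}")


def to_source(language, name, value):
    """Return source code that binds `name` to `value` in `language`."""
    if language == "python":
        return f"{name} = {value!r}"
    if language == "javascript":
        return f"const {name} = {json.dumps(value)};"
    if language == "r":
        return f"{name} <- {r_literal(value)}"
    raise ValueError(f"unsupported language: {language}")


config = {"n": 1, "tags": ["a", "b"]}
print(to_source("python", "maximum", 10))
print(to_source("r", "maximum", 10))
print(to_source("javascript", "cfg", config))
print(to_source("r", "cfg", config))
```

The generated line would be executed in the target kernel before the importing cell runs. Tables are left out here because they need a richer interchange format, such as the TSV listed above.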

Status

I currently have a working parser class, polyparser.py, which successfully parses the file sample.poly.

I also have a working wrapper for the jupyter_client in simple_kernel.py that allows me to create kernels in multiple languages. A separate test of running code in different kernels (so far I have tested Python, R, bash, javascript, typescript, and nodejs) is multi_kernel_test.py.

Short-Term ToDo

  • Choose a format and write a parser
  • Create a parser class
  • Combine simple_kernel.py, polyparser.py, and multi_kernel_client.py into a prototype of parsing an actual .poly file.
  • Learn how to create kernels in languages other than Python.
  • Learn to use more of the jupyter_client API such as the JSON payloads described in the documentation.
