I've spent a little bit of time looking into Markdown flavors that might be of interest and wanted to update this issue with some more perspective.
First off, I think this is the main takeaway (it is a suggestion, not a directive): our build system should support a strict subset of Pandoc markdown or RMarkdown. It could do this in addition to another language like rST.
Why support Pandoc Markdown?
I spoke with a few folks at RStudio, which is the main driver behind RMarkdown. This is the flavor of markdown supported by bookdown. It has proven resilient and quite popular in the R community, and supports many of the features that we'd need for publishing. RMarkdown is a subset of Pandoc markdown (here is a summary of Pandoc markdown and here is the announcement that RMarkdown is a subset of Pandoc markdown). This means that if we were to support the same subset of Pandoc Markdown, then we'd be supporting a language that is already utilized by a huge community of people.
How would we support this?
Here is a useful post with information on what we'd need to do to support this.
I think there are two options for supporting Pandoc markdown.
- First is to write a direct parser that goes from Pandoc markdown -> a docutils AST. This could be done, for example, by building on top of the recommonmark project, which does this for the base "commonmark" flavor of markdown.
- Second is to build a bridge between markdown and rST, such as the m2r package. This is also what nbsphinx does to support reading Jupyter notebooks.
In either case, we'd want to define a subset of Pandoc markdown syntax that we wish to support, and then create a mapping from that subset onto either docutils objects or rST. This could be a standalone Sphinx extension that would be really useful for the community outside of just this project.
As a side note, here's an interesting post about the differences between Pandoc markdown and rST.
@chrisjsewell has already made a really interesting implementation of this approach here: executablebooks/meta#12
A few caveats
There are a few potential pitfalls to this...here are some that I can think of:
- This could get edge-casey, depending on how much of the Pandoc or RMarkdown world we want to support.
- Perhaps we should think purely in terms of the within-cell markdown flavor we wish to support, and then farm out the representation of a notebook in that markdown flavor to something like jupytext.
- We should do some profiling to see if this would add significant overhead to the build times.
What about directives and roles?
The biggest question to my mind is what to do about directives and roles (if we are using Sphinx under the hood). These are one of the most powerful features in Sphinx, and something we could take advantage of to extend new features for books. But, there are no native "directives and roles" features in markdown.
One idea would be to piggy-back on Pandoc markdown syntax for these.
Directives
In rST directives look like this:
.. mydirective::
   :myparam1: myval1
   :myparam2: myval2
In markdown, this could be expressed with Pandoc syntax.
For example, Pandoc allows you to separate <div> elements with fences like so:
::: mydiv
# Some markdown
inside the div
:::
and optionally:
::: myotherdiv {.myattribute}
:::
Perhaps the pattern of ::: something could be mapped onto directives in rST. For example, something like:
::: toctree {maxdepth=1}
* page1
* page2
:::
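To make that mapping concrete, here is a minimal sketch that rewrites one block of the `::: name {key=value}` form proposed above into rST directive text. The function name `div_to_directive` and both regular expressions are hypothetical. It handles only a single un-nested block, it assumes list items in the body are document names (as in the toctree example), and it is not how Pandoc or Sphinx actually parse anything.

```python
import re
import shlex

# Hypothetical pattern for the opening fence: ::: name {key=value ...}
OPEN_FENCE = re.compile(r"^:::+\s*(?P<name>[\w-]+)\s*(?:\{(?P<attrs>[^}]*)\})?\s*$")
CLOSE_FENCE = re.compile(r"^:::+\s*$")


def div_to_directive(text: str) -> str:
    """Rewrite one un-nested `::: name {k=v}` block as an rST directive."""
    lines = text.strip().splitlines()
    match = OPEN_FENCE.match(lines[0])
    if match is None or not CLOSE_FENCE.match(lines[-1]):
        raise ValueError("expected a block opened and closed by ':::' fences")

    out = [f".. {match['name']}::"]
    # shlex keeps quoted values such as key="some value" together.
    for token in shlex.split(match["attrs"] or ""):
        key, _, value = token.partition("=")
        out.append(f"   :{key}: {value}")
    out.append("")  # a blank line separates the options from the content

    for line in lines[1:-1]:
        # Assumption: bullet items are document names, so drop the marker.
        entry = re.sub(r"^[*-]\s+", "", line.strip())
        out.append(f"   {entry}" if entry else "")
    return "\n".join(out)


example = "::: toctree {maxdepth=1}\n* page1\n* page2\n:::"
print(div_to_directive(example))
```

A real implementation would more likely build docutils nodes directly than emit rST text, but the text form makes the correspondence between the two syntaxes easy to see.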
Roles
For in-line markup, we could use the "bracketed spans" syntax from Pandoc markdown. This is intended for making custom "span" elements in your text like so:
[This is *some text*]{.class key="val"}
However, we could piggy-back on this by defining some specific roles, e.g.:
I'm now linking to a [different document]{doc=anotherPage} which contains [this equation]{eq=myeqID}. And also for [references]{ref=mybibtexref}.
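As a rough sketch of how such spans could be translated, the snippet below rewrites `[text]{role=target}` into the rST form `` :role:`text <target>` ``. The name `spans_to_roles` and the regular expression are hypothetical, and the rewrite is purely textual: whether a given Sphinx role accepts the explicit-title form is up to that role, so treat this as an illustration of the idea and not a working converter.

```python
import re

# Hypothetical pattern for a bracketed span carrying exactly one
# key=value attribute, e.g. [different document]{doc=anotherPage}.
SPAN = re.compile(r"\[(?P<text>[^\]]*)\]\{(?P<role>\w+)=(?P<target>[^}\s]+)\}")


def spans_to_roles(text: str) -> str:
    """Rewrite [text]{role=target} spans as rST :role:`text <target>`."""

    def replace(match: re.Match) -> str:
        return f":{match['role']}:`{match['text']} <{match['target']}>`"

    return SPAN.sub(replace, text)


print(spans_to_roles("See [different document]{doc=anotherPage}."))
```

Ordinary Pandoc spans such as `[This is *some text*]{.class key="val"}` do not match the pattern and pass through unchanged, so the two uses of the syntax can coexist in this sketch.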
Curious what folks think about that...
I'll update this issue if I can think of some other things to consider...
from myst-parser.
@rowanc1 good point about needing a language to do the rendering. Another possibility is to piggy-back on Jupyter for some of this. E.g., there are some interesting JS tools that use a Binder kernel under the hood to add interactivity backed by a Python/R/whatever kernel:
I also wanted to ping @stefanv and @rossbar who might have thoughts on markdown and its use in a publishing system. I believe that Elegant Scipy is written entirely in markdown, and they're hoping to keep that content in (more or less) the same markup language
Was going through the CommonMark forums and found an interesting comment from JGM re: extension syntax in markdown: https://talk.commonmark.org/t/support-for-extension-token/2771/7
> This means that if we were to support the same subset of Pandoc Markdown, then we'd be supporting a language that is already utilized by a huge community of people.
This is a serious advantage, and could lead to complementarities in building tools and training.
> This is the flavor of markdown supported by bookdown. It has proven resilient and quite popular in the R community, and supports many of the features that we'd need for publishing
Yes. The bookdown extensions should also be seriously considered. When I went through the executablebooks/meta#11 material, bookdown seemed to have a solution for everything I had done through jupinx.
Have people here heard of Idyll? It is an interactive markdown syntax, which might be relevant here? I have been working on an editor/renderer for this sort of content which is quite similar (https://components.ink/). Ink is written in html directly, so less relevant, but perhaps some of the ideas/schema of bringing interactivity into the markdown might be?
I find this style of interactivity very cool in having the text documents themselves react directly to interaction. This allows for "scalable documents" as well as text that can update directly. This also allows for embedding variables directly in the prose. A few images below to show what I mean.
I have been giving this style of interactive document quite a bit of thought over the last few months (albeit outside of the Jupyter ecosystem) and can expand if people are curious. Having components of this be backed by/interoperable with Jupyter would be quite exciting.
@rowanc1 that kind of functionality would be awesome to have. I've always loved the documents that connect the text with the outputs in an interactive fashion.
Thinking through how to support more complex features (like the neat linking stuff mentioned above), I did a little thought experiment about how to include "directives" and "roles" in markdown. If we supported this, it would let us extend the language to interesting features like the ones that @rowanc1 describes. I added a section with two brainstorms for "directives and roles in markdown" above, and would love to hear what people think.
As an example of how the Pandoc syntax I suggested above might work, you could accomplish the basic idyll example here:
# Hello World
[var name:"x" value:5 /]
The value of x is [Display value:x format:"d" /].
[Range value:x min:0 max:10 /]
With something like this:
# Hello world
::: var {name="x" value=5} :::
The value of x is []{display_value=x format="d"}.
::: range {value=x min=0 max=10} :::
The directives and roles look pretty promising! A few other thoughts if you go down this route.
Scopes
One of the important things that I have seen is the introduction of variable scopes so that you can maintain state in a section of the document. That is, not everything lives in the global document namespace, you can section them off ([Display value:scope1.x /] or the Pandoc equivalent). This is really important in larger documents or in referencing into a scope that you are reusing/importing. I think when connecting this with other computational kernels that also becomes quite important. You may have some client-side presentational calculations (format etc.) - and that should be able to execute without necessarily talking to a computation server.
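One way to picture scoped variables is a named scope layered over the global document namespace. The sketch below uses Python's `collections.ChainMap` for that layering. The names `global_vars`, `scopes` and `lookup`, and the values in them, are hypothetical and only illustrate the `scope1.x` lookup mentioned above. This is not how Ink or Idyll implement scopes.

```python
from collections import ChainMap

# Hypothetical document state: a global namespace plus one named scope
# that overrides x but inherits everything else from the globals.
global_vars = {"x": 5, "format": "d"}
scopes = {"scope1": ChainMap({"x": 42}, global_vars)}


def lookup(name: str):
    """Resolve `x` in the global namespace, or `scope1.x` in a named scope."""
    scope_name, dot, var = name.partition(".")
    if not dot:
        return global_vars[name]
    return scopes[scope_name][var]


print(lookup("x"), lookup("scope1.x"), lookup("scope1.format"))
```

Because the lookup needs only plain dictionaries, a presentational calculation of this kind could run client-side without contacting a kernel.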
Transformations
Another issue going down this path is the language of small calculations/transformations. For example, one of the examples I have used is to have the text say "free" when price is equal to zero. This requires you to determine the language that the transformation is written in. In my case (i.e. in Ink), I have chosen JavaScript, as I believe this is probably the main (if not the only?) presentational environment that will be dynamic. This may present some (small) complications for the rendering pipeline (i.e. you need a Node environment to evaluate variables).
From Ink.
Web components
I went down a path in 2017 of creating a parser for what I called .xmd (extensible markdown). I got a parser and a bit of a spec going, but it was brittle and I basically gave up on extending markdown (I think the larger community involvement here changes that calculation). The next approach I took was going into web components, which allows you to define XML components for a browser to parse and display. For example, the variable declaration [var name:"x" value:5 /] becomes <ink-var name="x" value="5" />. I have a full comparison here that might jog some other thinking if you decide to go down this path.
Using web components means the markup output (e.g. from this project) is completely declarative and there should be a 1:1 mapping between the properties in markdown and the attributes in XML (which is important for any round-trip considerations). I think this is quite exciting as the toolchain developed here can be completely separate from the rendering side - and the project is about the standards of what the properties, etc. are called, and less about the rendering implementation (e.g. the js library you choose to import).
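The 1:1 mapping between properties and attributes can be sketched with the standard-library XML module. The helper names `to_component` and `from_component` are hypothetical, and the `ink-` prefix is taken from the `<ink-var />` example above. Note that attribute values come back as strings, so the round trip is exact only at the string level.

```python
import xml.etree.ElementTree as ET

PREFIX = "ink-"  # taken from the <ink-var /> example; other names are made up


def to_component(name: str, props: dict) -> str:
    """Serialise properties as attributes of an <ink-NAME /> element."""
    element = ET.Element(PREFIX + name, {k: str(v) for k, v in props.items()})
    return ET.tostring(element, encoding="unicode")


def from_component(markup: str) -> tuple:
    """Recover the component name and its (string-valued) properties."""
    element = ET.fromstring(markup)
    return element.tag[len(PREFIX):], dict(element.attrib)


markup = to_component("var", {"name": "x", "value": 5})
print(markup)
print(from_component(markup))
```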
Let me know if you want me to expand on any of this, or put these thoughts somewhere else! Excited to see where this project goes.
> I've always loved the documents that connect the text with the outputs in an interactive fashion.
For sure. But can't we already do that with packages and extensions?
I think it is pretty hard to get that working in a language-neutral way (especially if we want a bijective transformation to ipynb). I have had a lot of success with https://github.com/JuliaGizmos/Interact.jl, for example, in Julia, but those sorts of features are tied to the particular language and package.
Correct, Elegant Scipy was written in markdown, though not all of the desired features (labels/cross-referencing, etc.) were supported by the particular build system (comprising notedown and nbconvert).
From my perspective, markdown makes a lot of sense for a publishing system that aims to support "non-expert" users, i.e. someone like Jane from the user personas. Jupyter, GitHub, GitLab, etc. are very popular, and people who use these tools have already been exposed to markdown. A limited superset of new syntax that provides the necessary features for scientific publishing therefore seems like a natural way to appeal to a lot of potential users: learning a few new things in a language you have some familiarity with is a lot less daunting than learning a whole new language.
I am certainly no expert when it comes to tooling for scientific publishing, and have not thought about it as deeply as many others have (cf. the many interesting ideas and informative issues/PRs in this repo). I aim to convert/add elements to Elegant Scipy with pandoc/rmarkdown to have a concrete test case that is relevant for other upcoming textbook projects. I expect this process, and exploring various conversion/build tools with the resulting document, will be enlightening as to what features are truly important for our publication needs moving forward.
Another potentially relevant point - if we run into performance issues when parsing markdown (or similar) documents, then we could look into some other parsers for this.
E.g. here are commonmark parsers in several languages
- Haskell (in-progress): https://github.com/jgm/commonmark-hs/tree/master/commonmark
- Javascript: https://github.com/commonmark/commonmark.js/
- C: https://github.com/commonmark/cmark and this related Python wrapper: https://github.com/PavloKapyshin/paka.cmark
- Python: https://github.com/miyuchina/mistletoe
- Python: https://github.com/readthedocs/commonmark.py (what recommonmark uses)
I think all will convert to an AST rather than doing a direct-to-HTML conversion, which might mean we could piggy-back on it?
With the additional Rendering and Execution layers that Jupyter provides, it will be important to keep front of mind the difference between the underlying Text representation and the Rendering representation, and where each of those elements is produced (i.e. through a build parser or a supporting extension for the notebook etc.).
The way we have been thinking about this recently is:
Text Syntax (i.e. Markdown) <-> IPYNB(as JSON)
The uniqueness of the notebook is that it has both representations: IPYNB(as JSON), a machine-readable text representation, and IPYNB(as Rendered HTML), a finished product. Hopefully we can also make a human-readable version of IPYNB(as JSON) for direct text representations.
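A minimal sketch of the Text Syntax <-> IPYNB(as JSON) direction, using only the standard library: it splits text on markdown code fences into alternating markdown and code cells and wraps them in a notebook-shaped dictionary. The function names are hypothetical. The sketch assumes every fenced block is Python code and ignores outputs and metadata; tools such as jupytext handle this far more carefully.

```python
import json

FENCE = "`" * 3  # markdown code fence, built indirectly to keep this listing readable


def text_to_notebook(text: str) -> dict:
    """Split text on code fences into alternating markdown and code cells."""
    cells = []
    for index, chunk in enumerate(text.split(FENCE)):
        is_code = index % 2 == 1
        if is_code:
            # Drop the info string (e.g. "python") on the opening fence line.
            _, _, chunk = chunk.partition("\n")
        source = chunk.strip("\n")
        if not source:
            continue
        cell = {"cell_type": "code" if is_code else "markdown",
                "metadata": {}, "source": source}
        if is_code:
            cell.update(execution_count=None, outputs=[])
        cells.append(cell)
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def notebook_to_text(notebook: dict) -> str:
    """Inverse of text_to_notebook; assumes all code cells are Python."""
    parts = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            parts.append(f"{FENCE}python\n{cell['source']}\n{FENCE}")
        else:
            parts.append(cell["source"])
    return "\n\n".join(parts)


text = "# Title\n\n" + FENCE + "python\nprint(1)\n" + FENCE + "\n\nDone."
serialised = json.dumps(text_to_notebook(text), indent=1)  # the .ipynb content
print(notebook_to_text(json.loads(serialised)) == text)
```

The round trip is lossless here only because the example text already uses the exact spacing and fence label that `notebook_to_text` emits, which is the kind of constraint a real text-based standard would have to spell out.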
This is an interesting discussion on common-mark
https://talk.commonmark.org/t/generic-directives-plugins-syntax/444
Note - I opened up a thread in the Jupyter community forum to see if people have thoughts about text-based standards: https://discourse.jupyter.org/t/should-jupyter-recommend-a-text-based-representation-of-the-notebook/3273/9 (that thread is focused on a text-based representation of a notebook structure, not changing the flavor of markdown that notebooks support)