Comments (23)
Behold the first Markdown directive parser!
See the bottom of https://github.com/chrisjsewell/mistletoe/blob/myst/test.ipynb
Current format is:
````{note}
abcd *abc* [a](link)
```{warning}
xyz
```
````
which is transformed to docutils AST:
<document source="">
<note>
<paragraph>
abcd
<emphasis>
abc
<pending_xref refdomain="True" refexplicit="True" reftarget="link" reftype="any" refwarn="True">
<reference refuri="link">
a
<warning>
<paragraph>
xyz
FYI for all the tests (which are extensive) see: https://travis-ci.org/chrisjsewell/mistletoe
from myst-parser.
Very cool. I see you're picking up line numbers corresponding to cells in the AST. So ticking all the boxes already in terms of what was needed...
from myst-parser.
This looks totally awesome!
A quick question: should it be possible to configure a default directive/role, akin to Sphinx? These could use a blank `{}`, for example.
from myst-parser.
You've worked hard!!!
I personally find the YAML syntax far more readable than `{name key=value}` when there are multiple options. But opinion will be split on that point.
Regarding the YAML syntax, could you do
```{image} path/to/image
---
height: 20
width: 40
---
Here is a *caption*.
```
That seems a bit more symmetric --- and hence easy to remember.
from myst-parser.
Yeh that could also work ta.
Implemented roles and math as well now (no option key/val parsing yet). It actually could end up being more powerful than RST in some respects, because you can nest inline elements, which isn't possible in RST:
````{note}
abcd *abc* [a](link)
```{warning}
xyz
```
````
```{figure}+ path/to/image
height: 40
---
Caption
```
**{code}`` a=1{`} ``**
**$a=1$**
$$b=2$$
`` a=1{`} ``
goes to:
<document source="">
<note>
<paragraph>
abcd
<emphasis>
abc
<pending_xref refdomain="True" refexplicit="True" reftarget="link" reftype="any" refwarn="True">
<reference refuri="link">
a
<warning>
<paragraph>
xyz
<figure>
<image height="40" uri="path/to/image">
<caption>
Caption
<paragraph>
<strong>
<literal classes="code">
a=1{`}
<paragraph>
<strong>
<math>
a=1
<paragraph>
<math_block xml:space="preserve">
b=2
<paragraph>
<literal>
a=1{`}
from myst-parser.
@choldgraf @mmcky @jstac @AakashGfude I've added the Sphinx Parser
You just install my fork of mistletoe (`pip install -e .[sphinx,testing]`, on the `myst` branch), and add `extensions = ["mistletoe"]` to your `conf.py` and it will pick up all the `.md` files.
Note if you look in myst/test/test_sphinx/test_sphinx_builds.py, I have set up automated testing of sphinx builds, for folders in myst/test/test_sphinx/sourcedirs. So if you run that with pytest it will actually generate the `_build` folders (comment out the `remove_sphinx_builds` fixture, so that they are not removed at the end of the test).
from myst-parser.
A suggestion from John Macfarlane:
Since we've been talking about dedicated syntax that would map onto a directive, but wouldn't be confusable with code blocks, use what RMarkdown and Pandoc do and use `{}` for "special" inline or block literals, something like:
```{mydirective}
This is
my special section
literal
```
We could assume that any code blocks that had curly brackets were block-level directives, and reference the first element in the `{}` against our list of directives. If it doesn't exist, fall back to assuming it is just an attribute.
This would also be fairly parsable in other markdown parsers, since the `{}` pattern is quite common, and we wouldn't introduce any extra syntax. Also we could then still use
```language
This is
my language syntax
```
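A rough sketch of that lookup-with-fallback in Python, just to make the idea concrete. The `KNOWN_DIRECTIVES` set and the `classify_fence` function are made up for illustration; they are not part of mistletoe or any existing parser.

```python
import re
from typing import Tuple

# Hypothetical registry of directive names (illustrative only).
KNOWN_DIRECTIVES = {"note", "warning", "image", "figure"}

# An info string of the form "{name}", optionally followed by arguments.
BRACED_INFO = re.compile(r"^\{(?P<name>[^\s{}]+)\}\s*(?P<args>.*)$")


def classify_fence(info: str) -> Tuple[str, str, str]:
    """Return (kind, name, arguments) for a code fence info string."""
    info = info.strip()
    match = BRACED_INFO.match(info)
    if match is None:
        # Plain fences such as "python" keep their usual meaning.
        language, _, rest = info.partition(" ")
        return ("code", language, rest.strip())
    name = match.group("name")
    if name in KNOWN_DIRECTIVES:
        return ("directive", name, match.group("args"))
    # Unknown braced name: fall back to treating it as an attribute.
    return ("attribute", name, match.group("args"))
```

With this, an info string of `{note}` is classified as a directive, `{mydirective}` falls back to an attribute because it is not registered, and `language` stays an ordinary code block.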
from myst-parser.
also - just a note for @rowanc1 here, I feel like if we end up using Sphinx and have a directive / role syntax for markdown, then maybe that's a place where components.ink pieces could be inserted into content at build time by writing a role/directive that injects the proper JS and HTML into the page (maybe as a separate sphinx extension?) curious what you think about that...
from myst-parser.
My interpretation of how this would apply is:
Some happy text.
```{ink-scope name=scope1}
``{ink-var name=x value=2}
My variable $x=$``{ink-display name=x}.
```
I am putting the `scope` in there as an example. It gets a bit messy, especially if you have multiple block directives. For example, styling an input as a callout box.
A couple of questions:
- Would indentation or raw html input be allowed?
- Any thoughts on empty content for inline elements?
For example: indentation
{ink-scope name=scope1}:
    ``{ink-var name=x value=2}
    {ink-callout kind=info}:
        Variable $x=$``{ink-display name=x}.
For example: html
<ink-scope name="scope1">
``{ink-var name=x value=2}
Variable $x=$``{ink-display name=x}.
</ink-scope>
And that would either be ignored in other representations - or perhaps if you have an intermediate AST then it could last until there? I liked the comment you posted about the C markdown parser coming to a common xml representation that can be acted upon.
from myst-parser.
So I've added testing against most of the docutils directives (see here), and added parsing of arguments, e.g.
```{image} path/to/image
```
The last part is to parse options. Parsing them inline, like `` ```{name key=value} ``, has been suggested, but a major problem with this is that it would break the current code fence regex, which looks for a string with no spaces for the language component (I also don't think it looks very nice).
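To illustrate the problem, here is a small Python sketch. The `FENCE_OPEN` pattern is a simplified stand-in written for this example (an assumption, not the actual regex used in mistletoe); it only shares the property that the language component cannot contain whitespace.

```python
import re
from typing import Optional

# Simplified stand-in for a fence opener: a run of three or more backticks,
# optional spaces, then a language component with no whitespace in it.
FENCE_OPEN = re.compile(r"^(`{3,}) *(\S*)")


def fence_language(line: str) -> Optional[str]:
    """Return the language component of a fence opener, or None if no fence."""
    match = FENCE_OPEN.match(line)
    return match.group(2) if match else None
```

With a pattern like this, a `{image} path/to/image` opener yields `{image}` as the language and the argument is left over for separate handling, but a `{name key=value}` opener is cut off at the first space, so the language component comes back as the unbalanced `{name`.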
I think the YAML block is the best way and I was thinking, for efficient parsing, it would be good to signify in the first line if the block contains options. Something like (note the `+`):
```{image}+ path/to/image
height: 20
width: 40
---
Here is a *caption*.
```
Then it would read everything as YAML until either a `---` is found or the end of the block is reached.
from myst-parser.
@choldgraf FYI front-matter does start with `---` (see here), so it makes sense in the directives to also do this, which I've now changed to:
```{name} argument text
---
option: 1
---
content with *markdown* **syntax**
```
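For anyone following along, the split between options and content in this front-matter style could be sketched roughly as below. This is only an illustration under assumptions: `split_options` is a hypothetical helper, the flat `key: value` splitting is a stand-in for a real YAML parser, and treating an unterminated options block as "all options, no content" is a guess at the behaviour.

```python
from typing import Dict, Tuple


def split_options(body: str) -> Tuple[Dict[str, str], str]:
    """Split a directive body into (options, content).

    Options are only recognised when the first line is exactly '---' and
    they run until the next '---' line, so a later '---' in the content
    is left alone as ordinary Markdown.
    """
    lines = body.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, body
    options: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return options, "\n".join(lines[index + 1:])
        # Stand-in for a real YAML parser: flat "key: value" pairs only.
        key, _, value = line.partition(":")
        if key.strip():
            options[key.strip()] = value.strip()
    # No closing '---' (assumed behaviour): everything was options.
    return options, ""
```

Because the check is anchored on the first line, a body that does not open with `---` is returned untouched without scanning it.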
from myst-parser.
Love your work @chrisjsewell. Outstanding.
from myst-parser.
Duuude - it works! So cool! Tonight I'll try making a little sphinx documentation site in your myst branch using the content that @AakashGfude put together...I am curious how it'll look!
from myst-parser.
(Chris pointed me to those discussions; I am an extensive sphinx user due to being one of the maintainers of the pandas docs, which is a quite big sphinx site. And I am excited about the issues you are tackling here: I love sphinx, but I also love to see improvements to it ;))
One thing I am wondering: to what extent are you already set on the syntax for roles and directives?
It seems you are now taking the syntax for code (both inline and blocks) and adding a role/directive name in the `{}`.
This is closer to existing markdown syntax, so I can imagine this is easier to extend an existing parser for this? (and it's also closer to things in the existing standard / pandoc, which are very good reasons)
But thinking about some use cases for roles in the documentation projects I am working with, I think something along the lines of the generic directives syntax proposal might be easier to work with (as an end user):
Small example rst snippet:
We can link to :meth:`pandas.DataFrame` in the API reference
or to another section :ref:`here <label>` (:issue:`1234`).
How it might look based on the role examples above (the details might not be correct):
We can link to `pandas.DataFrame`{meth} in the API reference
or to another section `here`{ref, id=label} (`1234`{issue}).
And how it might look with the linked proposal:
We can link to :meth[pandas.DataFrame] in the API reference
or to another section :ref[here]{label} (:issue[1234]).
Personally, I think the third snippet "looks" better than the second (but that's very subjective of course. Maybe that's because I am so used to having colons in rst .. ;-))
But maybe a slightly more objective argument: I think having the role name come first, instead of in the end, improves readability. And it also gives more contrast with actual code snippets.
from myst-parser.
I think having the role name come first, instead of in the end, improves readability.
Yep, that's how it has now been implemented, as `` {name}`content` ``. I guess the issue with using square brackets is that they are not degradable when using a standard Markdown parser; with backticks the content will remain raw text, whereas in brackets it will be treated as Markdown.
Also with colons, this might clash with the potential syntax extension of field lists. For example, if you want to be able to use the `:orphan:` metadata token.
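As a rough illustration of the name-first role syntax, here is a Python sketch that picks roles out of a line of text. The `ROLE` pattern and `find_roles` function are hypothetical and written for this example only; the real tokenizer may well differ, and the whitespace stripping is an assumption modelled on how code spans usually behave.

```python
import re
from typing import List, Tuple

# A role name in braces, immediately followed by a backtick-delimited span.
# The closing run must have the same number of backticks as the opening run.
ROLE = re.compile(
    r"\{(?P<name>[A-Za-z0-9_\-+:]+)\}(?P<ticks>`+)(?P<content>.+?)(?P=ticks)"
)


def find_roles(text: str) -> List[Tuple[str, str]]:
    """Return (role name, content) pairs found in a line of text."""
    return [
        (match.group("name"), match.group("content").strip())
        for match in ROLE.finditer(text)
    ]
```

A plain code span with no `{name}` prefix is not matched, which is the degradation property mentioned above: a standard Markdown parser would simply show the braces as text followed by an ordinary code span.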
from myst-parser.
This is great stuff, thanks @chrisjsewell!
Wondering about that yaml header: if you use two `---` lines, that takes up the majority of space in the fenced block. Can you think of any risk in removing the first instance? I couldn't immediately see a downside.
```{name} argument text
---
--- and any such arbitrary text
```
Quickly surveying the landscape: in pandoc, the yaml blocks are surrounded by `---` and `...` respectively (no idea why); Hugo uses matching `---`; org-mode uses `#+VARIABLE_NAME: value`.
from myst-parser.
@stefanv I believe the main reason for this is because otherwise the regex search can become really expensive.
Imagine that you have lots of code blocks with parameters inside. Because `---` is also valid markdown, you need to figure out if the `---` is there because it is the break between YAML config and the content, or if it is just regular markdown `---`. So you have to do some more complex search to figure it out.
If you know there's a character that defines "this is the start of config" then it becomes much easier, so adding a starting `---` makes this trivial to figure out, at the cost of extra verbosity.
After using it a bit, I think a way we could get around this issue is to also support some kind of arguments in the first line, and suggest that people use this only if they have a very small number of arguments. Then if the number of args is non-trivial (maybe > 2 or so) they can use the YAML, and if the number of args is small they can keep it close to a one-liner.
from myst-parser.
Another option would be to denote the arguments section with a special character on each line. For example, parameters can be provided by starting a line with `:` at the beginning of the content block. E.g.:
```{directive}
:key: val
:key2: val2
:arg3:
:key4: |
  Val 4
Content
```
That would have the benefit of even more parity w/ rST. For a very short paragraph then you'd have something like:
```{code-block} python
:linenos:
My content
```
from myst-parser.
Yeh as I've noted in #24, I think I will add in a block token for docutils field list syntax, which I didn't actually realise before was part of the RST spec. Then you should be able to use:
```{name} arguments
:option: a
:non-kwarg:
Content
```
from myst-parser.
@chrisjsewell is the idea that this would replace the YAML parsing? Or just be an option? I quite like the YAML syntax. Instead of allowing full rST syntax could we just say that if the block starts with lines that begin with `:` then those will be parsed as YAML lines? (AKA it is just a shorthand to avoid requiring the `---` fences?)
from myst-parser.
Yeh I don't think I'm going to add actual parsing for these field lists any more, in favour of just using YAML. But yeh for directives you could maybe include that alternative approach.
from myst-parser.
I think it'd be helpful to include the `:` short-hand for metadata. That way there are basically two options for YAML metadata, depending on whether you care about conciseness. As an example we could recommend:
If there are <= 2 configuration lines:
```{directivename}
:key: true
:key2: config2
```
If there are > 2 configuration lines:
```{directivename}
---
key: true
key2: config2
key3: config3
key4: |
  Multi line
  config
---
```
Either would be valid, but for cases where the directive just needs one or two config options (which is common) I think supporting `:` could keep things tighter. It would help avoid the case where there are more "configuration fence" lines than actual configuration options.
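A quick Python sketch of how the `:` short-hand could be read, to show that it needs no look-ahead. Everything here is hypothetical (`OPTION_LINE`, `split_colon_options`), and it only handles single-line values; multi-line values such as `key4: |` would still go through the `---` fenced form.

```python
import re
from typing import Dict, Tuple

# A short-hand option line: ":key: value", where the value may be empty.
OPTION_LINE = re.compile(r"^:(?P<key>[^:\s]+):\s*(?P<value>.*)$")


def split_colon_options(body: str) -> Tuple[Dict[str, str], str]:
    """Consume leading ':key: value' lines; the rest is content."""
    lines = body.splitlines()
    options: Dict[str, str] = {}
    consumed = 0
    for line in lines:
        match = OPTION_LINE.match(line)
        if match is None:
            # First non-option line ends the options section.
            break
        options[match.group("key")] = match.group("value")
        consumed += 1
    return options, "\n".join(lines[consumed:])
```

Parsing stops at the first line that does not look like an option, so the content can contain colons freely after that point.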
from myst-parser.