Comments (12)
I'm not sure of an ideal solution off-hand; just somehow you need to 'defer' the processing of Markdown span tokens until you've found all possible link/footnote definitions.
It's a bit of a peculiarity of Markdown that you can't, for example, just identify the references through a regex and store them for later replacement with the definitions, because whether or not a reference can be immediately resolved has a direct bearing on how the subsequent text is parsed.
For example, this:
[a*b]*
[a*b]: asd
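is parsed to this (a link with the text `a*b`, followed by a literal `*`):
a*b*
but if the definition is not available, this:
[c*d]*
is parsed to this (the brackets stay literal and `d]` is emphasised):
[cd]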
from myst-parser.
Actually I'm not sure if this is strictly part of the CommonMark spec (see this interactive demo). I'll have to see in mistletoe-ebp if storing all potential definition references as 'pending references' breaks any tests, because this would make things a little easier.
There are two places where a nested document is parsed:
- https://github.com/ExecutableBookProject/MyST-Parser/blob/c64f7068ba786a79561adb0017194ebc8bf15ba3/myst_parser/docutils_renderer.py#L108
- https://github.com/ExecutableBookProject/MyST-Parser/blob/c64f7068ba786a79561adb0017194ebc8bf15ba3/myst_parser/docutils_renderer.py#L838
The first fix is that both should be called with `reset_definitions=False`, to ensure we do not wipe the currently collected definitions from the global parse context (those generated from the initial parse of the source text).
To explain how this works: these references are only resolved if the definition is immediately available. To achieve this for a normal document parse in mistletoe, we run a first parse of the source text, in which span tokens are not yet processed (they are stored as raw strings in `SpanContainer`s) and link definitions are stored in a global `ParseContext` class. A subsequent 'walk' of the syntax tree then processes these containers and replaces them with the actual syntax tokens.
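The two-pass idea described above can be sketched roughly as follows. This is only an illustration: `Context`, `RawSpan`, `block_pass` and `span_pass` are hypothetical names, not mistletoe's actual API, and the regex-based reference handling is a deliberate simplification (as noted earlier in this thread, real Markdown references cannot be resolved by regex alone). The `reset_definitions` flag mirrors the behaviour discussed for nested parses.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, simplified stand-ins for a parse context and span container.
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")
REFERENCE = re.compile(r"\[([^\]]+)\]")


@dataclass
class Context:
    """Holds link definitions collected during the block pass."""
    definitions: dict = field(default_factory=dict)


@dataclass
class RawSpan:
    """A line of text whose inline content has not been parsed yet."""
    text: str


def block_pass(source, context, reset_definitions=True):
    """First pass: collect definitions, defer everything else as raw spans."""
    if reset_definitions:
        context.definitions.clear()
    spans = []
    for line in source.splitlines():
        match = DEFINITION.match(line)
        if match:
            context.definitions[match.group(1).lower()] = match.group(2)
        elif line.strip():
            spans.append(RawSpan(line))
    return spans


def span_pass(spans, context):
    """Second pass: resolve references now that all definitions are known."""
    def resolve(match):
        target = context.definitions.get(match.group(1).lower())
        if target is None:
            return match.group(0)  # unresolved: leave the text untouched
        return f'<a href="{target}">{match.group(1)}</a>'

    return [REFERENCE.sub(resolve, span.text) for span in spans]


context = Context()
# The definition comes after the reference, yet the reference still resolves.
spans = block_pass("[ref1]\n\n[ref1]: link\n[ref2]", context)
print(span_pass(spans, context))

# A nested parse keeps the outer definitions only if they are not reset.
nested = block_pass("[ref1]", context, reset_definitions=False)
print(span_pass(nested, context))
```

In this sketch, calling `block_pass` for nested content with the default `reset_definitions=True` would clear the outer definitions and leave `[ref1]` unresolved, which is the failure mode described above.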
An edge case to test here would be what happens if a definition is specified in a directive above/below that of another directive, e.g.:
```note
[ref1]: link
```
```note
[ref1]
[ref2]
```
```note
[ref2]: link
```
and also how nested directives work
@choldgraf something to consider in MyST-NB: currently, during the `Document.read` (per cell), definitions are reset, so definitions in one text cell cannot be used by a reference in another cell. This is a different behaviour to if the notebook was e.g. converted to a text document with Jupytext and parsed as a whole.
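As a rough illustration of the difference, here is a small self-contained sketch that compares per-cell definitions with definitions collected across the whole notebook. The helper names (`collect_definitions`, `unresolved`) and the example cells are hypothetical, and the regexes only approximate Markdown reference syntax; this is not how MyST-NB is implemented.

```python
import re

# Simplified patterns for link definitions and reference-style links.
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)
REFERENCE = re.compile(r"\[([^\]]+)\]")


def collect_definitions(text):
    """Map each (lower-cased) definition label to its target."""
    return {m.group(1).lower(): m.group(2) for m in DEFINITION.finditer(text)}


def unresolved(text, definitions):
    """List reference labels in the text that have no matching definition."""
    body = DEFINITION.sub("", text)
    return [
        label
        for label in REFERENCE.findall(body)
        if label.lower() not in definitions
    ]


# Two hypothetical text cells: the reference and its definition are split.
cells = ["See [ref1].", "[ref1]: https://example.com"]

# Definitions reset per cell: the reference in the first cell is lost.
per_cell = [unresolved(cell, collect_definitions(cell)) for cell in cells]

# Definitions collected once for the whole notebook: everything resolves.
shared = collect_definitions("\n\n".join(cells))
whole = [unresolved(cell, shared) for cell in cells]

print(per_cell)
print(whole)
```

Under these assumptions, the per-cell variant reports `ref1` as unresolved in the first cell, while the whole-notebook variant reports nothing unresolved.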
Interesting - do you imagine it would be better to convert the `ipynb` to a text file (maybe with `{execute}` markdown cells?) and parse that directly? One downside there is that an `ipynb` with outputs wouldn't work.
Is there a way that we could persist information like footnotes etc across cell tokenizing?
> Actually I'm not sure if this is strictly part of the CommonMark spec (see this interactive demo). I'll have to see in mistletoe-ebp if storing all potential definition references as 'pending references' breaks any tests, because this would make things a little easier.

Yeah, it's definitely not possible, or at least very difficult, to capture the following kind of parsing, whereby the presence of the definition massively changes the parsing flow:
[link [foo [bar]]](/uri)
[bar]: a
[link [foo [other]]](/uri)
[link [foo bar]](/uri)
So does this mean that for now we just need to tell people "if you want to use a link replacement with `ipynb` files in jupyter book, you need to have the link definitions in the same cell"?
@choldgraf @chrisjsewell this is a pretty important issue I think. My feeling is the `ipynb` document needs to be holistic rather than tokenistic (based on cells). Can we pass the AST between reads of the `cells`?
The duality between `ipynb` and `myst` means the entry point can be either format, so we should also consider `myst` text as the entry point.
> @choldgraf @chrisjsewell this is a pretty important issue I think. My feeling is the `ipynb` document needs to be holistic rather than tokenistic (based on cells). Can we pass the AST between reads of the `cells`?

This is my feeling; the thing is we need to build the full Markdown AST, before applying the renderer (i.e. Markdown AST -> docutils AST). This is mainly possible, with some restructuring of the MyST-NB parser but, as I've said before, probably precludes this notion of wrapping text cells in docutils elements, like here: https://github.com/ExecutableBookProject/MyST-NB/blob/c9478b70daeb34be30c48e807e3aa2ec5611f420/myst_nb/parser.py#L80
Shall we take this issue over to myst-nb then?
Maybe make a separate issue on there. In the first instance this is a myst-parser bug.
closed in #119