
Comments (12)

chrisjsewell commented on May 24, 2024

I'm not sure of an ideal solution off-hand; somehow you need to 'defer' the processing of Markdown span tokens until you've found all possible link/footnote definitions.

It's a peculiarity of Markdown that you can't just identify the references with a regex and store them for later replacement with the definitions, because whether or not a reference can be resolved immediately has a direct bearing on how the subsequent text is parsed.

For example, this:

[a*b]*

[a*b]: asd

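is parsed to a link with the text `a*b`, followed by a literal `*`:

`<a href="asd">a*b</a>*`

but if the definition is not available:

[c*d]*

the `*` inside the brackets opens emphasis instead:

`[c<em>d]</em>`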

from myst-parser.

chrisjsewell commented on May 24, 2024

Actually, I'm not sure if this is strictly part of the CommonMark spec (see this interactive demo). I'll have to see, in mistletoe-ebp, whether storing all potential definition references as 'pending references' breaks any tests, because this would make things a little easier.


chrisjsewell commented on May 24, 2024

There are two places where a nested document is parsed:

The first fix is that both should be called with reset_definitions=False, to ensure we do not wipe the definitions already collected in the global parse context (those generated from the initial parse of the source text).

To explain how this works: these references are only resolved if the definition is immediately available. To achieve this for a normal document parse in mistletoe, we run a first parse of the source text, where span tokens are not yet processed (stored as raw strings in SpanContainers), and link definitions are stored in a global ParseContext class. Then a subsequent 'walk' of the syntax tree is made to process these containers and replace them with the actual syntax tokens.
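
The two-pass flow described above can be sketched in plain Python. This is a toy model, not mistletoe's code: `ParseContext` and `SpanContainer` borrow the names mentioned above, but their fields, the `block_pass`, `span_pass` and `parse` functions, and the heavily simplified regexes are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Toy patterns: far simpler than the real CommonMark rules.
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")
REFERENCE = re.compile(r"\[([^\]]+)\]")


@dataclass
class ParseContext:
    """Hypothetical stand-in for a global store of link definitions."""
    definitions: dict = field(default_factory=dict)


@dataclass
class SpanContainer:
    """Holds unparsed inline text until every definition is known."""
    raw: str


def block_pass(text, context):
    """First pass: record definitions, defer all other non-blank lines."""
    containers = []
    for line in text.splitlines():
        match = DEFINITION.match(line)
        if match:
            context.definitions[match.group(1).lower()] = match.group(2)
        elif line.strip():
            containers.append(SpanContainer(line))
    return containers


def span_pass(containers, context):
    """Second pass: expand each container now that definitions are complete."""
    def resolve(match):
        target = context.definitions.get(match.group(1).lower())
        if target is None:
            return match.group(0)  # unresolved references stay literal
        return f'<a href="{target}">{match.group(1)}</a>'

    return [REFERENCE.sub(resolve, c.raw) for c in containers]


def parse(text):
    context = ParseContext()
    return span_pass(block_pass(text, context), context)
```

Because the span pass only runs after the whole text has been scanned, a reference resolves even when its definition appears later in the source, e.g. `parse("See [ref1] and [ref2]\n\n[ref1]: link")` links `ref1` and leaves `[ref2]` untouched.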

An edge case to test here would be what happens if a definition is specified in a directive above or below the directive that references it, e.g.:

```note
[ref1]: link
```

```note
[ref1]
[ref2]
```

```note
[ref2]: link
```

and also how nested directives work
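
To make the edge case concrete, here is a toy model of one possible outcome. It assumes that directive bodies are parsed one at a time in document order against a shared definition store; that ordering is an assumption, not confirmed behaviour. `parse_nested` and `parse_directives` are hypothetical helpers, and the `reset_definitions` parameter here only mimics the idea of the flag discussed above.

```python
import re

# Toy patterns: a definition line and a line holding a single reference.
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")
REFERENCE = re.compile(r"^\[([^\]]+)\]\s*$")


def parse_nested(body, definitions):
    """Toy nested parse of one directive body against a shared store."""
    lines = body.splitlines()
    # Definitions found here become visible to this body and to later ones.
    for line in lines:
        match = DEFINITION.match(line)
        if match:
            definitions[match.group(1)] = match.group(2)
    # Map each reference to its target, or None if it cannot be resolved.
    resolved = {}
    for line in lines:
        match = REFERENCE.match(line)
        if match:
            resolved[match.group(1)] = definitions.get(match.group(1))
    return resolved


def parse_directives(bodies, reset_definitions):
    """Parse directive bodies in order, optionally wiping the store each time."""
    definitions = {}
    results = []
    for body in bodies:
        if reset_definitions:
            definitions.clear()
        results.append(parse_nested(body, definitions))
    return results


# The three note directives from the example above.
bodies = ["[ref1]: link", "[ref1]\n[ref2]", "[ref2]: link"]
shared = parse_directives(bodies, reset_definitions=False)
isolated = parse_directives(bodies, reset_definitions=True)
```

In this model `shared[1]` resolves `ref1` but not `ref2`, because the third directive has not been parsed yet, while `isolated[1]` resolves neither. Whether the real parser shows the same order dependence is exactly what a test would need to establish.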


chrisjsewell commented on May 24, 2024

@choldgraf something to consider in MyST-NB: currently, definitions are reset during each Document.read (one per cell), so a definition in one text cell cannot be used by a reference in another cell. This differs from what would happen if the notebook were, e.g., converted to a text document with Jupytext and parsed as a whole.

https://github.com/ExecutableBookProject/MyST-NB/blob/c9478b70daeb34be30c48e807e3aa2ec5611f420/myst_nb/parser.py#L78
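
The difference between the two behaviours can be illustrated with a small sketch. It is not how MyST-NB is implemented: `collect_definitions` and `render_cell` are hypothetical helpers, the cells are plain strings, and the regexes are heavily simplified stand-ins for real Markdown parsing.

```python
import re

DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)
REFERENCE = re.compile(r"\[([^\]]+)\]")


def collect_definitions(cells):
    """Gather link definitions from the given cell sources."""
    definitions = {}
    for source in cells:
        for label, target in DEFINITION.findall(source):
            # Keep the first definition of a label, as CommonMark does.
            definitions.setdefault(label, target)
    return definitions


def render_cell(source, definitions):
    """Strip definition lines, then replace every resolvable reference."""
    body = DEFINITION.sub("", source)

    def resolve(match):
        target = definitions.get(match.group(1))
        if target is None:
            return match.group(0)  # leave unresolved references as text
        return f'<a href="{target}">{match.group(1)}</a>'

    return REFERENCE.sub(resolve, body).strip()


cells = ["See [docs] for details.", "[docs]: https://example.com"]

# Per-cell behaviour: each cell only sees its own definitions.
per_cell = [render_cell(c, collect_definitions([c])) for c in cells]

# Whole-notebook behaviour: definitions are gathered from every cell first.
whole = collect_definitions(cells)
whole_notebook = [render_cell(c, whole) for c in cells]
```

Here `per_cell[0]` keeps the literal `[docs]`, whereas `whole_notebook[0]` contains the link, which mirrors the gap between cell-by-cell parsing and parsing the notebook as one document.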


choldgraf commented on May 24, 2024

Interesting - do you imagine it would be better to convert the ipynb to a text file (maybe with {execute} markdown cells?) and parse that directly? One downside is that ipynb files with outputs wouldn't work.

Is there a way that we could persist information like footnotes etc across cell tokenizing?


chrisjsewell commented on May 24, 2024

> Actually, I'm not sure if this is strictly part of the CommonMark spec (see this interactive demo). I'll have to see, in mistletoe-ebp, whether storing all potential definition references as 'pending references' breaks any tests, because this would make things a little easier.

Yeah, it's definitely not possible, or at least very difficult, to capture the following kind of parsing, where the presence of the definition massively changes the parsing flow:

[link [foo [bar]]](/uri)

[bar]: a

[link [foo [other]]](/uri)

is parsed to:

`[link [foo <a href="a">bar</a>]](/uri)`

`<a href="/uri">link [foo [other]]</a>`


choldgraf commented on May 24, 2024

So does this mean that, for now, we just need to tell people "if you want to use a link replacement with ipynb files in Jupyter Book, you need to have the link definitions in the same cell"?


mmcky commented on May 24, 2024

@choldgraf @chrisjsewell this is a pretty important issue I think. My feeling is the ipynb document needs to be holistic rather than tokenistic (based on cells). Can we pass the AST between reads of the cells?

The duality between ipynb and MyST means the entry point can be either format, so we should also consider MyST text as the entry point.


chrisjsewell commented on May 24, 2024

> @choldgraf @chrisjsewell this is a pretty important issue I think. My feeling is the ipynb document needs to be holistic rather than tokenistic (based on cells). Can we pass the AST between reads of the cells?

This is my feeling too; we need to build the full Markdown AST before applying the renderer (i.e. Markdown AST -> docutils AST). This is largely possible with some restructuring of the MyST-NB parser but, as I've said before, it probably precludes the notion of wrapping text cells in docutils elements, like here: https://github.com/ExecutableBookProject/MyST-NB/blob/c9478b70daeb34be30c48e807e3aa2ec5611f420/myst_nb/parser.py#L80


choldgraf commented on May 24, 2024

Shall we take this issue over to myst-nb then?


chrisjsewell commented on May 24, 2024

Maybe make a separate issue on there. In the first instance this is a myst-parser bug.


chrisjsewell commented on May 24, 2024

Closed in #119.

