Comments (7)
Unfortunately, regexes won't cut it to fix this problem in full generality, e.g. to support
$$ x = \text{what is $x$?} $$
Would you be willing to review a PR that rewrites the parsing engine to count `{`/`}`s and check whether `$`s are escaped with `\`, when processing in `dollars` mode?
from markdown-it-texmath.
hmm ... interesting.
Usually I am very reluctant to extend the regexes used in texmath. They are performance critical, as they are evaluated once with each user keystroke.
In fact, an escaped dollar sign `\$` is a valid element inside LaTeX math and, interestingly, it works flawlessly inside the other, non-dollar delimiters.
Dollar delimiters are special beasts. Taking a quick look at the `math-inline` regex
/\$((?:\S)|(?:\S.*?\S))\$/
I recognized the final `\S` (the last character before the closing dollar) as roughly a shortcut for the negated character class `[^\r\n\t\f\v ]`, which can easily be extended to `[^\r\n\t\f\v \\]`, presumably without too much performance cost.
After successfully testing it on https://regex101.com/ for several relevant cases, I considered it worthwhile to also use it as a guard in the `math-block` regexes.
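To make the difference concrete, here is a small sketch. The two patterns are the ones quoted in this comment; the sample input and the variable names are made up for illustration.

```javascript
// Original inline rule: the last character before the closing `$` is any non-space.
const oldInline = /\$((?:\S)|(?:\S.*?\S))\$/;
// Extended rule: that last character must also not be a backslash.
const newInline = /\$((?:[^\r\n\t\f\v \\])|(?:\S.*?[^\r\n\t\f\v \\]))\$/;

// String.raw keeps the backslash literally, so the text is: $a+\$ = b$
const sample = String.raw`$a+\$ = b$`;

const oldBody = oldInline.exec(sample)[1]; // ends at the escaped dollar: a+\
const newBody = newInline.exec(sample)[1]; // skips the escaped dollar: a+\$ = b

console.log(oldBody);
console.log(newBody);
```

The old rule treats the backslash as the last character of the formula, so the match closes on the escaped dollar; the extended class rejects that position and the lazy quantifier moves on to the real closing `$`.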
Surprisingly, a drastic simplification helps here: going from
/\${2}([^$]+?)\${2}/
to
/\${2}(.+?)\${2}/
Expect it to be available in the next version.
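As a rough illustration of why the simplification helps (the sample string and variable names are assumptions, only the two patterns come from this comment): `[^$]` can never step over an escaped dollar, while `.` can.

```javascript
const blockOld = /\${2}([^$]+?)\${2}/; // body may not contain any `$`
const blockNew = /\${2}(.+?)\${2}/;    // body may contain `$`, but no newline

// The text is: $$a+\$ = b$$
const escapedSample = String.raw`$$a+\$ = b$$`;

const oldResult = blockOld.exec(escapedSample); // null: the `$` of `\$` blocks the match
const newResult = blockNew.exec(escapedSample); // body: a+\$ = b

console.log(oldResult);
console.log(newResult[1]);
```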
thanks ...
Erik, thanks for your offer to help.
If my bug fix attempt runs into problems, I would like to come back to your PR offer.
/\${2}(.+?)\${2}/
Changing from `[^$]` to `.` excludes newlines. I think you want to allow single newlines (though you could forbid double newlines, as TeX does) within math. Instead of `.` you probably want `[^]`, as in:
/\${2}([^]+?)\${2}/
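A minimal check of the newline behaviour, with a made-up multiline input:

```javascript
const dotRule = /\${2}(.+?)\${2}/;   // `.` does not match line terminators
const anyRule = /\${2}([^]+?)\${2}/; // `[^]` matches any character, newlines included

const multiline = '$$\na + b\n$$';

const dotMatch = dotRule.exec(multiline); // null
const anyMatch = anyRule.exec(multiline); // body is "\na + b\n"

console.log(dotMatch);
console.log(JSON.stringify(anyMatch[1]));
```

In engines that support the `s` (dotAll) flag, `.` with that flag would behave like `[^]` here; `[^]` has the advantage of not depending on a flag.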
But none of this fixes use of `\$` (escaped `$`) or `\text{$...$}` (nested `$`) within a math expression.
One potential fix would be to use the regex as above, and then check whether the closing `$` is actually escaped (has an odd number of `\`s before it), or whether the match has more unescaped `{`s than `}`s in it, and in that case do more work (probably another regex match to get the contents until the next `\${2}`, concatenating, and trying again). This would mostly just take extra time in the weird edge cases of escaped and nested `$`s, which currently don't work, so it seems like a win? But it would involve counting the number of `{`s and `}`s (without preceding `\`s) to make sure they're matched. I imagine this is all way faster than the cost of calling KaTeX, though.
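A minimal sketch of this match-then-extend idea, assuming plain string search in place of the regex. All function names here (`isEscapedAt`, `braceBalance`, `matchMath`) are hypothetical and not part of markdown-it-texmath.

```javascript
// True when the character at `index` is preceded by an odd number of backslashes.
function isEscapedAt(src, index) {
  let backslashes = 0;
  for (let k = index - 1; k >= 0 && src[k] === '\\'; k--) backslashes++;
  return backslashes % 2 === 1;
}

// Number of unescaped `{` minus number of unescaped `}` in `text`.
function braceBalance(text) {
  let depth = 0;
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '\\') i++;            // skip the escaped character
    else if (text[i] === '{') depth++;
    else if (text[i] === '}') depth--;
  }
  return depth;
}

// `open` is the index of the opening delimiter. Try each later occurrence of
// `delim` as the closing one; extend while it is escaped or braces are still open.
function matchMath(src, open, delim) {
  const bodyStart = open + delim.length;
  let close = src.indexOf(delim, bodyStart + 1);
  while (close >= 0) {
    const body = src.slice(bodyStart, close);
    if (!isEscapedAt(src, close) && braceBalance(body) <= 0) {
      return { body, end: close + delim.length };
    }
    close = src.indexOf(delim, close + 1); // extend to the next candidate
  }
  return null;
}

console.log(matchMath(String.raw`$\text{$x+y$}$`, 0, '$')); // body: \text{$x+y$}
console.log(matchMath(String.raw`$$ a = \$$$`, 0, '$$'));    // body: " a = \$"
```

In the common case the first candidate is accepted, so the extra cost is one backward scan for backslashes and one pass over the body.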
Alternatively (and what I originally had in mind), the regex could match the opening `\${2}`, and then do a secondary search for `{` or `}` or `\${2}`, checking for escaping in each case, and repeat until finding the unnested closing delimiter. I could test to see which is faster in which cases.
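For comparison, a sketch of this second approach. It is written as a single character loop rather than a repeated secondary regex search, and `findClosingDelimiter` is a hypothetical name, not an existing function.

```javascript
// Scan forward from `bodyStart` and return the index of the first `delim`
// that is neither escaped nor inside braces, or -1 if there is none.
function findClosingDelimiter(src, bodyStart, delim) {
  let depth = 0;
  for (let i = bodyStart; i < src.length; i++) {
    const ch = src[i];
    if (ch === '\\') i++;                                 // skip the escaped character
    else if (ch === '{') depth++;
    else if (ch === '}') depth = Math.max(0, depth - 1);  // tolerate a stray `}`
    else if (depth === 0 && src.startsWith(delim, i)) return i;
  }
  return -1;
}

const inlineSrc = String.raw`$\text{$x+y$}$ and $z$`;
const inlineClose = findClosingDelimiter(inlineSrc, 1, '$');
console.log(inlineSrc.slice(1, inlineClose)); // \text{$x+y$}

const blockSrc = String.raw`$$ a = \$$$`;
const blockClose = findClosingDelimiter(blockSrc, 2, '$$');
console.log(JSON.stringify(blockSrc.slice(2, blockClose))); // " a = \\$"
```

This visits every character once, so it is linear in the length of the formula regardless of how many `$`s it contains.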
In fact I had temporarily forgotten that `.` excludes newlines. So taking your proposed `[^]` works fine ...
inline: [
  { name: 'math_inline_double',
    rex: /\${2}(.+?)\${2}/gy
  },
  { name: 'math_inline',
    rex: /\$((?:[^\r\n\t\f\v \\])|(?:\S.*?[^\r\n\t\f\v \\]))\$/gy
  }
],
block: [
  { name: 'math_block_eqno',
    rex: /\${2}([^]+?[^\\])\${2}\s*?\(([^)\s]+?)\)/gmy
  },
  { name: 'math_block',
    rex: /\${2}([^]+?[^\\])\${2}/gmy
  }
]
... as you can see with this example code (https://github.com/goessner/markdown-it-texmath/blob/master/test/bug-dollars.html)
const str = `
# Simple Dollar tests
## Inline
here "$a+\\$ = b$" we "$\\$$" go "$\\text{\\$some...\\$}$"
## Inline block (single line only)
$$a+\\$ = \\text{\\$more...\\$} \\$$$ or ...
## Block (multiline)
$$
a+\\$ = \\text{\\$text...\\$} \\$
$$
`
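As a quick sanity check outside the plugin, the posted `math_inline` pattern can be run over the inline line of that test string. This is only a sketch: the `y` flag is dropped so that `matchAll` can scan the whole line, and `String.raw` is used so that a single backslash in the source is a single backslash in the text.

```javascript
// Posted math_inline pattern, with only the `g` flag so matchAll can scan freely.
const mathInline = /\$((?:[^\r\n\t\f\v \\])|(?:\S.*?[^\r\n\t\f\v \\]))\$/g;

// Same text as the "Inline" line of the test string above.
const inlineLine = String.raw`here "$a+\$ = b$" we "$\$$" go "$\text{\$some...\$}$"`;

// Collect the captured formula bodies.
const bodies = [...inlineLine.matchAll(mathInline)].map((m) => m[1]);

for (const body of bodies) console.log(body);
// a+\$ = b
// \$
// \text{\$some...\$}
```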
Handling unescaped dollars inside `\text{$...$}` as well would take an effort that I consider disproportionate to the benefit. It seems reasonable to also escape the dollars, `\text{\$...\$}`, in that edge case.
So I would prefer to live with that small shortcoming of markdown-it-texmath for performance and simplicity reasons.
I hope I have not overlooked anything ... thanks.
Minor points:
* `math_inline_double` still uses `.`; should probably switch that to `[^]` too.
* I think you can use `[^\s\\]` instead of `[^\r\n\t\f\v \\]`. (The behavior is slightly different: the former treats all Unicode space identically. Probably better?)
* `[^]+?[^\\]` (which occurs in both block rules) seems to require at least two characters. Should probably be `[^]*?[^\\]`.
* Shouldn't `math_inline_double` have the same addition to exclude a trailing `\`?
* `\$((?:[^\r\n\t\f\v \\])|(?:\S.*?[^\r\n\t\f\v \\]))\$` can be simplified to `\$([^\r\n\t\f\v \\]|\S.*?[^\r\n\t\f\v \\])\$` or `\$([^\s\\]|\S.*?[^\s\\])\$`. (I'm not quite sure why you're forbidding spaces next to the `$`s but I assume that's intentional, to avoid some stray matching.)
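Three of these points can be checked directly. The snippet below is only an illustration with made-up inputs; in particular, the guarded pattern in the third check is an assumed form of the suggested fix, not a rule taken from the plugin.

```javascript
// 1. `\s` also covers Unicode spaces such as the no-break space (U+00A0).
const nbsp = '\u00a0';
const explicitClass = /[^\r\n\t\f\v \\]/.test(nbsp); // true: NBSP is accepted as a non-space
const shorthandClass = /[^\s\\]/.test(nbsp);          // false: NBSP is whitespace

// 2. `[^]+?[^\\]` needs at least two characters between the delimiters.
const twoCharMin = /\${2}([^]+?[^\\])\${2}/.test('$$x$$'); // false
const oneCharMin = /\${2}([^]*?[^\\])\${2}/.test('$$x$$'); // true

// 3. Without the trailing-backslash guard, the match can close on an escaped `\$`.
const doubleSrc = String.raw`$$a = \$$$`;
const unguarded = /\${2}(.+?)\${2}/.exec(doubleSrc)[1];      // a = \
const guarded = /\${2}([^]*?[^\\])\${2}/.exec(doubleSrc)[1]; // a = \$

console.log({ explicitClass, shorthandClass, twoCharMin, oneCharMin, unguarded, guarded });
```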
These new rules seem to deal with `\$` properly. Nice!!
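Nested `$`s are for re-entering math mode. `\text{$x+y$}` is different from `\text{\$x+y\$}`: the former typesets x+y as math inside the text, while the latter prints literal dollar signs around x+y.
So it's not possible to escape these instances of `$`, as `\$` means something in LaTeX.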
Unmatched braces generate errors in KaTeX, though. (I see either `Uncaught ParseError: KaTeX parse error: Expected '}', got 'EOF' at end of input` or `KaTeX parse error: Unexpected end of input in a macro argument, expected '}' at end of input: \text{`.) So perhaps that could be detected, and trigger an "extension" regex? I believe the extension regex is exactly `math_inline` without the leading `$`, or applying `math_inline` but starting from the final `$`. On input `$\text{$x+y$}$`, after matching `$\text{$`, you'd next grab `x+y$`, fail, and then grab `}$`, and then succeed. This is quadratic time in the number of `$`s, but that's probably small... Alternatively, when in extension mode, we could count `{`/`}`s, so call KaTeX at most twice.
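The retry loop could look roughly like the sketch below. It does not call KaTeX: `accepts` is a hypothetical callback standing in for "KaTeX parsed this body without an unmatched-brace error", and here it is replaced by a crude brace count. The escape check is also deliberately simplistic (a single preceding backslash).

```javascript
// Try successive closing `$` candidates until `accepts(body)` returns true.
// Returns the accepted body and the number of attempts, or null.
function extendUntilAccepted(src, open, accepts) {
  let attempts = 0;
  let close = src.indexOf('$', open + 1);
  while (close >= 0) {
    if (src[close - 1] !== '\\') {           // skip `\$` (simplistic escape check)
      const body = src.slice(open + 1, close);
      attempts++;
      if (accepts(body)) return { body, attempts };
    }
    close = src.indexOf('$', close + 1);     // extend to the next `$`
  }
  return null;
}

// Stand-in for a KaTeX parse: same number of `{` and `}`.
const bracesMatch = (tex) => tex.split('{').length === tex.split('}').length;

const nested = extendUntilAccepted(String.raw`$\text{$x+y$}$`, 0, bracesMatch);
console.log(nested.body, nested.attempts); // \text{$x+y$} 3
```

The three attempts correspond to the trace above: `\text{` fails, `\text{$x+y` fails, and `\text{$x+y$}` succeeds. Each attempt re-examines the whole body so far, which is where the quadratic behaviour comes from.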
Sorry for the delay. Thanks for your valuable input.
* `math_inline_double` still uses `.`; should probably switch that to `[^]` too.
math_inline should be written on a single line, but I added it to be more forgiving here.
* I think you can use `[^\s\\]` instead of `[^\r\n\t\f\v \\]`. (The behavior is slightly different: the former treats all Unicode space identically. Probably better?)
This is definitely better ... taken.
* `[^]+?[^\\]` (which occurs in both block rules) seems to require at least two characters. Should probably be `[^]*?[^\\]`.
Yes ... thanks for catching.
* Shouldn't `math_inline_double` have the same addition to exclude a trailing `\`?
sure ... done.
* `\$((?:[^\r\n\t\f\v \\])|(?:\S.*?[^\r\n\t\f\v \\]))\$` can be simplified to `\$([^\r\n\t\f\v \\]|\S.*?[^\r\n\t\f\v \\])\$` or `\$([^\s\\]|\S.*?[^\s\\])\$`. (I'm not quite sure why you're forbidding spaces next to the `$`s but I assume that's intentional, to avoid some stray matching.)
yes ... again a significant improvement.
Nested `$`s are for re-entering math mode. `\text{$x+y$}` is different from `\text{\$x+y\$}`. So it's not possible to escape these instances of `$`, as `\$` means something in LaTeX.
Now I understand the effect of re-entering math mode. Escaping indeed makes no sense. I still consider it a not very relevant edge case in practice, and I don't want to invest that significant implementation effort at present. A PR is always welcome though.
thanks again ...