rehypejs / rehype

HTML processor powered by plugins, part of the @unifiedjs collective

Home Page: https://unifiedjs.com

License: MIT License

JavaScript 100.00%

ast javascript html rehype unified

rehype's Issues

Incorrectly parsed dash-cased svg properties as camelCase

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

rehype-parse

Link to runnable example

https://codesandbox.io/s/rehype-debug-forked-llth4

Steps to reproduce

Parse svg with clip-rule via rehype-parse

<svg class='sc-gGLxEB set-color' fill='#8A8F98' height='16' stroke='none' viewBox='0 0 24 24' width='16'>
  <path clip-rule='evenodd'
        d='M6 0C2.68629 0 0 2.68629 0 6V18C0 21.3137 2.68629 24 6 24H18C21.3137 24 24 21.3137 24 18V6C24 2.68629 21.3137 0 18 0H6ZM7.54545 7H10.4545C10.7558 7 11 7.24421 11 7.54545V10.4545C11 10.7558 10.7558 11 10.4545 11H7.54545C7.24421 11 7 10.7558 7 10.4545V7.54545C7 7.24421 7.24421 7 7.54545 7ZM13.5455 7H16.4545C16.7558 7 17 7.24421 17 7.54545V10.4545C17 10.7558 16.7558 11 16.4545 11H13.5455C13.2442 11 13 10.7558 13 10.4545V7.54545C13 7.24421 13.2442 7 13.5455 7ZM10.4545 13H7.54545C7.24421 13 7 13.2442 7 13.5455V16.4545C7 16.7558 7.24421 17 7.54545 17H10.4545C10.7558 17 11 16.7558 11 16.4545V13.5455C11 13.2442 10.7558 13 10.4545 13ZM13.5455 13H16.4545C16.7558 13 17 13.2442 17 13.5455V16.4545C17 16.7558 16.7558 17 16.4545 17H13.5455C13.2442 17 13 16.7558 13 16.4545V13.5455C13 13.2442 13.2442 13 13.5455 13Z'
        fill-rule='evenodd'
        stroke='none'></path>
</svg>

Expected behavior

clip-rule should stay in properties as is

Actual behavior

The clip-rule property is stored in the AST as clipRule.

Runtime

Node v16

Package manager

yarn v1

OS

macOS

Build and bundle tools

No response

rehype-parse throws an error "cannot read property of undefined (reading spaceSeparated)" when HTML contains unencoded markup characters in <pre><code>

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Expected behavior

rehype should succeed in parsing HTML content that has some incorrect closing tags, just as most browsers do.

Actual behavior

It throws the error: Cannot read properties of undefined (reading 'spaceSeparated')

Runtime

Node v14

Package manager

npm 6

OS

macOS

Build and bundle tools

Other (please specify in steps to reproduce)

Parser incorrectly reads image srcset when containing commas in image URL

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

8.0.3

Link to runnable example

No response

Steps to reproduce

The issue occurs when there is a comma included in the image URLs of the srcset field

My current flow of data:

WordPress GraphQL API via WPGraphQL
HTML is retrieved as raw HTML string
HTML is passed into unified pipeline that uses rehype https://github.com/colbyfayock/spacejelly.dev/blob/main/src/lib/parse.js#L21
result is rendered via React with dangerously set inner HTML

Example raw HTML:

<img loading=\"lazy\" width=\"2560\" height=\"1504\" src=\"https://res.cloudinary.com/colbycloud/images/w_2560,h_1504/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki.jpg?_i=AA\" alt=\"Website with grid of characters from Stranger Things\" class=\"wp-image-847\" srcset=\"https://res.cloudinary.com/colbycloud/images/w_2560,h_1504/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki.jpg?_i=AA 2560w, https://res.cloudinary.com/colbycloud/images/w_300,h_176,c_scale/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki-300x176.jpg?_i=AA 300w, https://res.cloudinary.com/colbycloud/images/w_1024,h_601,c_scale/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki-1024x601.jpg?_i=AA 1024w, https://res.cloudinary.com/colbycloud/images/w_768,h_451,c_scale/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki-768x451.jpg?_i=AA 768w, https://res.cloudinary.com/colbycloud/images/w_1536,h_902,c_scale/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki-1536x902.jpg?_i=AA 1536w, https://res.cloudinary.com/colbycloud/images/w_2048,h_1203,c_scale/f_auto,q_auto/v1636561367/nextjs-app-stranger-things-wiki/nextjs-app-stranger-things-wiki-2048x1203.jpg?_i=AA 2048w\" sizes=\"(max-width: 2560px) 100vw, 2560px\" />

Expected behavior

The parsed srcset property should keep each image URL together with its size descriptor, even when the URL itself contains commas.

What the parsed values may look like if still following a similar comma delimited pattern:

[
  'https://res.cloudinary.com/colbycloud/images/w_2560,h_829/f_auto,q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images.jpg?_i=AA 2560w',
  'https://res.cloudinary.com/colbycloud/images/w_300,h_97,c_scale/f_auto,q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-300x97.jpg?_i=AA 300w',
  'https://res.cloudinary.com/colbycloud/images/w_1024,h_332,c_scale/f_auto,q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-1024x332.jpg?_i=AA 1024w',
  'https://res.cloudinary.com/colbycloud/images/w_768,h_249,c_scale/f_auto,q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-768x249.jpg?_i=AA 768w',
  'https://res.cloudinary.com/colbycloud/images/w_1536,h_498,c_scale/f_auto,q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-1536x498.jpg?_i=AA 1536w',
  'https://res.cloudinary.com/colbycloud/images/w_2048,h_664,c_scale/f_auto,q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-2048x664.jpg?_i=AA 2048w'
]

Actual behavior

When a comma is included in an image URL, the srcset parser treats it as a delimiter and splits the value in the wrong places.

Example when parsed:

[
  'https://res.cloudinary.com/colbycloud/images/w_2560',
  'h_829/f_auto',
  'q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images.jpg?_i=AA 2560w',
  'https://res.cloudinary.com/colbycloud/images/w_300',
  'h_97',
  'c_scale/f_auto',
  'q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-300x97.jpg?_i=AA 300w',
  'https://res.cloudinary.com/colbycloud/images/w_1024',
  'h_332',
  'c_scale/f_auto',
  'q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-1024x332.jpg?_i=AA 1024w',
  'https://res.cloudinary.com/colbycloud/images/w_768',
  'h_249',
  'c_scale/f_auto',
  'q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-768x249.jpg?_i=AA 768w',
  'https://res.cloudinary.com/colbycloud/images/w_1536',
  'h_498',
  'c_scale/f_auto',
  'q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-1536x498.jpg?_i=AA 1536w',
  'https://res.cloudinary.com/colbycloud/images/w_2048',
  'h_664',
  'c_scale/f_auto',
  'q_auto/v1636561853/stranger-things-characters-images/stranger-things-characters-images-2048x664.jpg?_i=AA 2048w'
]
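
One way to avoid the split shown above is to treat a comma as a separator only when whitespace or the end of the string follows it. The sketch below assumes a hypothetical helper named splitSrcset and made-up example.com URLs. It is not how rehype parses srcset, and it deliberately ignores candidates that are separated by a comma with no following space.

```javascript
// Hypothetical helper, not part of rehype: split a srcset value into
// image candidate strings without breaking URLs that contain commas.
function splitSrcset(value) {
  return value
    // A comma counts as a separator only when whitespace or the end of the
    // string follows it; a comma inside a URL is followed by more URL text.
    .split(/,(?=\s|$)/)
    .map((candidate) => candidate.trim())
    .filter((candidate) => candidate.length > 0)
}

// Made-up URLs that mimic the comma-separated transformation segments.
const srcset =
  'https://example.com/w_300,h_176,c_scale/a.jpg 300w, ' +
  'https://example.com/w_1024,h_601,c_scale/a.jpg 1024w'

const candidates = splitSrcset(srcset)
console.log(candidates)
```

Each entry keeps the URL and its width descriptor together, which is the shape asked for under "Expected behavior".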

Runtime

Node v14

Package manager

yarn v1

OS

macOS

Build and bundle tools

Next.js

SVG attributes getting transformed to invalid camel case

I'm trying to use rehype to process HTML containing inline SVGs, and the SVGs are coming out broken. Attributes like stroke-linecap="round" in the input are coming out invalid as strokeLineCap="round" in the output.

Steps to reproduce

I've set up a test repo demonstrating the issue. Here's the code I'm running there. I hooked up the code to run in a GitHub Actions workflow too so you can see it happen for yourself there together with all the gory details about the environment. It seems to be environment-independent anyway as I first encountered this on my MacBook.

const rehype = require("rehype")
const processor = rehype()

const html = `
<!doctype html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8" />
  </head>
  <body>
    <svg xmlns="http://www.w3.org/2000/svg" stroke-linecap="round" stroke-linejoin="round" viewBox="0 0 8 8">
      <path stroke="#fff1e8" d="M0 6V3h1l1 1v2"/>
    </svg>
  </body>
</html>

`

console.log(processor.processSync(html).toString())

Expected behavior

The SVG should come out the other side of processor.processSync(html).toString() and still be valid.

Actual behavior

<!doctype html><html lang="en" dir="ltr"><head>
    <meta charset="utf-8">
  </head>
  <body>
    <svg xmlns="http://www.w3.org/2000/svg" strokeLineCap="round" strokeLineJoin="round" viewBox="0 0 8 8">
      <path stroke="#fff1e8" d="M0 6V3h1l1 1v2"></path>
    </svg>


</body></html>

I'm not 100% convinced this is a bug yet, so I'm half-anticipating hearing that I've misunderstood something here. Still thought it was worth reporting though just in case!
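
The output above contains tree-style property names where attribute names were expected. Assuming the tree keeps camelCased property names (as the strokeLineCap in the output suggests), serializing needs a step that maps them back to attribute names. The sketch below uses a tiny hand-written table and a hypothetical serializeProperties function; the names come from these reports, and the real lookup used by rehype is not shown here.

```javascript
// Illustrative table built from the names in these reports; the mapping a
// real serializer uses is an assumption here and would be far larger.
const propertyToAttribute = new Map([
  ['clipRule', 'clip-rule'],
  ['strokeLineCap', 'stroke-linecap'],
  ['strokeLineJoin', 'stroke-linejoin']
])

// Serialize tree properties, falling back to the property name itself
// when no attribute name is known for it.
function serializeProperties(properties) {
  return Object.entries(properties)
    .map(([name, value]) => `${propertyToAttribute.get(name) || name}="${value}"`)
    .join(' ')
}

const properties = {
  strokeLineCap: 'round',
  strokeLineJoin: 'round',
  viewBox: '0 0 8 8'
}

const serialized = serializeProperties(properties)
console.log(serialized)
```

If the lookup step is skipped or uses the wrong table, the property names leak into the output unchanged, which matches the symptom reported here.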

Use native `Object.assign` instead of `xtend`

xtend can easily be replaced with Object.assign.

This reduces the number of dependencies and possibly also avoids duplicated versions when some packages do not use the same version range for the xtend package.

// immutable
Object.assign({}, a, b)

// mutable
Object.assign(a, b)
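
The difference between the two forms can be checked directly. This is a small sketch with made-up option objects (defaults, overrides, and target are illustrative names only):

```javascript
const defaults = {fragment: false, verbose: false}
const overrides = {fragment: true}

// Immutable style: merge into a fresh object, leaving both inputs untouched.
const merged = Object.assign({}, defaults, overrides)

// Mutable style: the first argument is modified in place and returned.
const target = {fragment: false}
const returned = Object.assign(target, overrides)

console.log(merged, defaults)
console.log(returned === target, target)
```

Note that Object.assign makes a shallow copy, so nested objects are still shared between the source and the result.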

Fix wrong link

Documentation: Wrong link

In the Introduction: https://github.com/rehypejs/rehype

Browse awesome rehype to find out more about the ecosystem

Expected behaviour

Should point to https://github.com/rehypejs/awesome

Actual behaviour

Points to https://github.com/retextjs/awesome

Thank You!

Add Support for Tag Swapping on Rehype-Stringify

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Problem

rehype-stringify should have a components option just like rehype-react.

This problem exists because I want to be able to convert my markdown to the limited set of HTML that Mastodon provides for their posts. I plan on implementing custom classes for formatting html tags in a way that is usable in a Mastodon post.

Solution

Add support for changing the html tags in rehype-stringify. Using the same format that rehype-react is fine

Alternatives

Convert to react first, then render results
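
Another possible alternative is to rename elements in the tree before it is stringified. The sketch below works on a plain object shaped like a syntax tree; renameTags and the tag mapping are hypothetical names for illustration, not an existing rehype-stringify option. In a unified pipeline, a function like this could presumably be wrapped in a plugin that runs before rehype-stringify.

```javascript
// Hypothetical transform: rename elements in a hast-like tree in place.
// `mapping` is a Map from the original tag name to its replacement.
function renameTags(node, mapping) {
  if (node.type === 'element' && mapping.has(node.tagName)) {
    node.tagName = mapping.get(node.tagName)
  }

  // Text nodes have no children, so fall back to an empty list.
  for (const child of node.children || []) {
    renameTags(child, mapping)
  }

  return node
}

// Hand-written tree standing in for parsed markdown output.
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [{type: 'text', value: 'Hello'}]
    },
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [{type: 'text', value: 'world'}]
        }
      ]
    }
  ]
}

renameTags(tree, new Map([['h1', 'p'], ['strong', 'span']]))
console.log(JSON.stringify(tree, null, 2))
```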

Parsing fails with noscript tag in head

Failure to parse `noscript` correctly in `head` tag

rehype-parse fails to parse correctly when there is a noscript tag in the head. The text child is placed into the body tag, leaving an empty noscript tag in the head, and all the remaining tags are also placed into the body tag instead of the head.

My environment

OS: Ubuntu 19.10
Packages: rehype: ^10.0.0, rehype-parse: ^6.0.2,
Env: node v12.4.0, npm 6.14.4

Steps to reproduce

See this repo for a demo.

Expected behavior

Given the following input:

<html>
  <head>
    <noscript>&lt;h1&gt;Hello, world&lt;/h1&gt;</noscript>
    <style>
      body { background-color: #ccc; }
    </style>
  </head>
  <body>
    <h1>Goodbye, Earthlings!</h1>
  </body>
</html>

... and given the following code:

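// Assumed setup (not shown in the report): `rehype` and `parse` are the
// rehype and rehype-parse packages, `report` is a vfile reporter, and
// `source` holds the HTML input above.
rehype().use(parse).process(source, (err, file) => {
  if (err) {
    console.log('error', err.message);
  } else {
    console.log(report(file)); // err is always falsy in this branch
    console.log(String(file));
  }
})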

... I would have expected the last statement to produce (ignoring indentation and formatting):

<html><head>
    <noscript>&#x3C;h1>Hello, world&#x3C;/h1></noscript>
    <style>
      body { background-color: #ccc; }
    </style>
  </head>
  <body>
    <h1>Goodbye, Earthlings!</h1>
  </body>
</html>

Actual behavior

The following output is produced:

{
  "type": "root",
  "children": [
    {
      "type": "element",
      "tagName": "html",
      "properties": {},
      "children": [
        {
          "type": "element",
          "tagName": "head",
          "properties": {},
          "children": [
            {
              "type": "text",
              "value": "\n    ",
              "position": {
                "start": {
                  "line": 3,
                  "column": 9,
                  "offset": 16
                },
                "end": {
                  "line": 4,
                  "column": 5,
                  "offset": 21
                }
              }
            },
            {
              "type": "element",
              "tagName": "noscript",
              "properties": {},
              "children": [],
              "position": {
                "start": {
                  "line": 4,
                  "column": 5,
                  "offset": 21
                },
                "end": {
                  "line": 4,
                  "column": 15,
                  "offset": 31
                }
              }
            }
          ],
          "position": {
            "start": {
              "line": 3,
              "column": 3,
              "offset": 10
            },
            "end": {
              "line": 4,
              "column": 15,
              "offset": 31
            }
          }
        },
        {
          "type": "element",
          "tagName": "body",
          "properties": {},
          "children": [
            {
              "type": "text",
              "value": "<h1>Hello, world</h1>\n    ",
              "position": {
                "start": {
                  "line": 4,
                  "column": 15,
                  "offset": 31
                },
                "end": {
                  "line": 5,
                  "column": 5,
                  "offset": 80
                }
              }
            },
            {
              "type": "element",
              "tagName": "style",
              "properties": {},
              "children": [
                {
                  "type": "text",
                  "value": "\n      body { background-color: #ccc; }\n    ",
                  "position": {
                    "start": {
                      "line": 5,
                      "column": 12,
                      "offset": 87
                    },
                    "end": {
                      "line": 7,
                      "column": 5,
                      "offset": 131
                    }
                  }
                }
              ],
              "position": {
                "start": {
                  "line": 5,
                  "column": 5,
                  "offset": 80
                },
                "end": {
                  "line": 7,
                  "column": 13,
                  "offset": 139
                }
              }
            },
            {
              "type": "text",
              "value": "\n  \n  \n    ",
              "position": {
                "start": {
                  "line": 7,
                  "column": 13,
                  "offset": 139
                },
                "end": {
                  "line": 10,
                  "column": 5,
                  "offset": 163
                }
              }
            },
            {
              "type": "element",
              "tagName": "h1",
              "properties": {},
              "children": [
                {
                  "type": "text",
                  "value": "Goodbye, Earthlings!",
                  "position": {
                    "start": {
                      "line": 10,
                      "column": 9,
                      "offset": 167
                    },
                    "end": {
                      "line": 10,
                      "column": 29,
                      "offset": 187
                    }
                  }
                }
              ],
              "position": {
                "start": {
                  "line": 10,
                  "column": 5,
                  "offset": 163
                },
                "end": {
                  "line": 10,
                  "column": 34,
                  "offset": 192
                }
              }
            },
            {
              "type": "text",
              "value": "\n  \n\n",
              "position": {
                "start": {
                  "line": 10,
                  "column": 34,
                  "offset": 192
                },
                "end": {
                  "line": 13,
                  "column": 1,
                  "offset": 211
                }
              }
            }
          ]
        }
      ],
      "position": {
        "start": {
          "line": 2,
          "column": 1,
          "offset": 1
        },
        "end": {
          "line": 13,
          "column": 1,
          "offset": 211
        }
      }
    }
  ],
  "data": {
    "quirksMode": true
  },
  "position": {
    "start": {
      "line": 1,
      "column": 1,
      "offset": 0
    },
    "end": {
      "line": 13,
      "column": 1,
      "offset": 211
    }
  }
}

<html><head>
    <noscript></noscript></head><body>&#x3C;h1>Hello, world&#x3C;/h1>
    <style>
      body { background-color: #ccc; }
    </style>
  
  
    <h1>Goodbye, Earthlings!</h1>
  

</body></html>

Bad logo rendering on github dark theme

Subject of the issue

The rehype logo is not displayed well in the repository's readme when I use GitHub's dark theme.

Your environment

Firefox, Github dark theme

Steps to reproduce

Enable Github Dark Theme, and open the repository page.

Expected behavior

The logo should be fully readable

Actual behavior

rehype-stringify 10.0.0 does not work according to documentation

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

rehype-stringify

Link to runnable example

No response

Steps to reproduce

Follow the example from the docs, making sure you've installed rehype-stringify@10.0.0.

Note that you get typescript failures.

Try to run the code. You'll get an error like

TypeError: Cannot `process` without `Compiler`
    at assertCompiler (file:///Users/ian/projects/com.ianwremmel/node_modules/unified/lib/index.js:520:11)
    at Function.process (file:///Users/ian/projects/com.ianwremmel/node_modules/unified/lib/index.js:377:5)
    at render (file:///Users/ian/projects/com.ianwremmel/build/index.js?t=1698196639195.3516:474:105)
    at file:///Users/ian/projects/com.ianwremmel/build/index.js?t=1698196639195.3516:484:52
    at Array.map (<anonymous>)
    at loader2 (file:///Users/ian/projects/com.ianwremmel/build/index.js?t=1698196639195.3516:483:11)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Object.callRouteLoaderRR (/Users/ian/projects/com.ianwremmel/node_modules/@remix-run/server-runtime/dist/data.js:52:16)
    at callLoaderOrAction (/Users/ian/projects/com.ianwremmel/node_modules/@remix-run/router/router.ts:3778:16)
    at async Promise.all (index 0)

This issue was initially reported in #149. It's not clear to me how they updated to the latest version to fix it since the initial bug report was about the latest version.

Expected behavior

Markdown should compile

Actual behavior

TypeError: Cannot `process` without `Compiler`
    at assertCompiler (file:///Users/ian/projects/com.ianwremmel/node_modules/unified/lib/index.js:520:11)
    at Function.process (file:///Users/ian/projects/com.ianwremmel/node_modules/unified/lib/index.js:377:5)
    at render (file:///Users/ian/projects/com.ianwremmel/build/index.js?t=1698196639195.3516:474:105)
    at file:///Users/ian/projects/com.ianwremmel/build/index.js?t=1698196639195.3516:484:52
    at Array.map (<anonymous>)
    at loader2 (file:///Users/ian/projects/com.ianwremmel/build/index.js?t=1698196639195.3516:483:11)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Object.callRouteLoaderRR (/Users/ian/projects/com.ianwremmel/node_modules/@remix-run/server-runtime/dist/data.js:52:16)
    at callLoaderOrAction (/Users/ian/projects/com.ianwremmel/node_modules/@remix-run/router/router.ts:3778:16)
    at async Promise.all (index 0)

Runtime

Other (please specify in steps to reproduce)

Package manager

npm 8

OS

macOS

Build and bundle tools

Remix

Unexpected html codes when parsing attributes with quotes

Subject of the issue

The character references emitted for quote characters in data attributes use the JavaScript (hexadecimal) code points rather than the ones I expected in HTML.

' -> &#x27; instead of &#x39;, maybe because ' is \u0027 in JS.
" -> &#x22; instead of &#x34;, maybe because " is \u0022 in JS.

I created a failing test for this bug on a fresh fork of the repo. https://github.com/benabel/rehype/commit/09f20182aec9c22bb482d5d1112d1b4f728467d6
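
The relationship between the hexadecimal and decimal forms of these references can be checked with plain JavaScript. This is only a sketch with two hypothetical helpers, hexRef and decRef; it does not use rehype-stringify.

```javascript
// Build hexadecimal and decimal character references for one character.
const hexRef = (ch) => `&#x${ch.codePointAt(0).toString(16)};`
const decRef = (ch) => `&#${ch.codePointAt(0)};`

console.log(hexRef("'"), decRef("'"))
console.log(hexRef('"'), decRef('"'))

// Code points 0x39 and 0x34 belong to other characters entirely.
const decoded = [String.fromCodePoint(0x39), String.fromCodePoint(0x34)]
console.log(decoded)
```

Hexadecimal 27 equals decimal 39 and hexadecimal 22 equals decimal 34, so &#x27; and &#39; name the same apostrophe, while &#x39; and &#x34; decode to the digits 9 and 4.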

Your environment

OS: linux
Packages: rehype-stringify
Env: yarn

Steps to reproduce

Stringify this html code, or run test api on the fork: https://github.com/benabel/rehype/commit/09f20182aec9c22bb482d5d1112d1b4f728467d6

<p data-content="This the new example with a 'quotation' mark"></p>
<p data-content='This the new example with a "quotation" mark'></p>

Expected behavior

<p data-content="This the new example with a &#x39;quotation&#x39; mark"></p>
<p data-content="This the new example with a &#x34;quotation&#x34; mark"></p>

Actual behavior

<p data-content="This the new example with a &#x27;quotation&#x27; mark"></p>
<p data-content="This the new example with a &#x22;quotation&#x22; mark"></p>

[docs]: XHTML compatibility

Subject of the feature

Problem

No information about XHTML compatibility. Only information about XML

Expected behavior

More information about XHTML compatibility

Alternatives

Unfortunately, we do not have a large set of tests, so we cannot check this ourselves. If XHTML is officially supported, it would be great to add a few words about it.

Sorry for the multiple issues. We are evaluating rehype, so this may be useful for other developers as well.

Prefer explicit options over implicit settings

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

latest

Link to runnable example

No response

Steps to reproduce

rehype/packages/rehype-parse/lib/index.js

Line 42 in 4f64443

const settings = Object.assign({}, options, processorSettings)

Expected behavior

Object.assign({}, processorSettings, options)

Actual behavior

Object.assign({}, options, processorSettings)
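
Because Object.assign copies sources from left to right, the last source wins when keys collide. A small sketch with made-up values (the option names and values here are illustrative only):

```javascript
// Settings configured on the processor as a whole.
const processorSettings = {fragment: false, verbose: true}
// Options passed explicitly to the plugin.
const options = {fragment: true}

// Current order: processor-wide settings are applied last, so they win.
const current = Object.assign({}, options, processorSettings)

// Proposed order: explicit options are applied last, so they win.
const proposed = Object.assign({}, processorSettings, options)

console.log(current.fragment, proposed.fragment)
```

With the proposed order the explicitly passed fragment value is kept, while keys that appear only in the processor settings still carry over.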

Runtime

Node v16

Package manager

No response

OS

No response

Build and bundle tools

No response

[docs]: Performance

Subject of the feature

It would be great to see a performance comparison with other HTML parsers, like the one here: https://github.com/fb55/htmlparser2#performance

Problem

Expected behavior

There are no problems. I think a table like this would help increase popularity and give developers some metrics.

Alternatives

I think we can use https://github.com/AndreasMadsen/htmlparser-benchmark to get results.

Why?

We want to integrate rehype into webpack and the webpack ecosystem to handle HTML/HTML entrypoints. We are evaluating existing solutions and their convenience. The project looks very good and has everything we need (API).

rehype-parse: parse error in xml cdata with raw closing angle bracket

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

[email protected]

Link to runnable example

No response

Steps to reproduce

git clone https://github.com/milahu/docbook2md
cd docbook2md
git checkout 077fbf159f16e8781336a955ef0269ac9499c39e
./run.sh

main script: src/docbook2md.ts

input file: examples/attrsets.xml

relevant section:

<programlisting><![CDATA[
let set = { a = { b = 3; }; };
in lib.attrsets.attrByPath [ "a" "b" ] 0 set
=> 3
]]></programlisting>

Expected behavior

correctly parse xml cdata

{
  programlisting: {
    type: "element",
    tagName: "programlisting",
    children: [
      { type: "text", value: '\nlet set = { a = { b = 3; }; };\nin lib.attrsets.attrByPath [ "a" "b" ] 0 set\n=> 3' }
    ]
  }
}

Actual behavior

the parser confuses the > in cdata with the end of cdata

{
  programlisting: {
    type: "element",
    tagName: "programlisting",
    children: [
      {
        type: "comment",
        value: '[CDATA[\nlet set = { a = { b = 3; }; };\nin lib.attrsets.attrByPath [ "a" "b" ] 0 set\n=',
        position: [Object]
      },
      { type: "text", value: "3 ]]>" }
    ]
  }
}
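
In HTML content, a CDATA-style opener is read as a bogus comment that ends at the first closing angle bracket, which matches the comment node above. One possible pre-processing step is to replace CDATA sections with entity-escaped text before the input reaches an HTML parser. The sketch below uses a hypothetical escapeCdata function and a shortened sample; it is an assumption for illustration, not something rehype-parse provides.

```javascript
// Hypothetical pre-processing step, not part of rehype: turn CDATA sections
// into entity-escaped text so an HTML parser sees plain character data.
function escapeCdata(xml) {
  return xml.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, (match, content) =>
    content
      // Escape the ampersand first so later replacements are not re-escaped.
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
  )
}

// Shortened version of the reported input.
const input = '<programlisting><![CDATA[\n=> 3\n]]></programlisting>'
const escaped = escapeCdata(input)
console.log(escaped)
```

A dedicated XML parser is still the more robust choice for XML input, since this only handles CDATA and none of the other XML features.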

Runtime

Deno

Package manager

No response

OS

No response

Build and bundle tools

No response

Parsing as document or fragment

Currently, there’s no way to add elements for optional opening tags. This should of course be available.
I’d say the default should be document mode.

rehype parse generates an additional <p></p> for html content

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

[email protected]

Link to runnable example

https://stackblitz.com/edit/github-frqzag?file=src%2Findex.ts

Steps to reproduce

Pass the following HTML to rehype-parse:

<div class="page-body">
  <p id="13d0fd81-b7f7-47ca-ae21-5b96a36c5f23" class="">aaa
  <div class="indented">
    <p id="22a66b31-b267-4924-b0c6-cf08772184b6" class="">bbb
    <div class="indented">
      <p id="d07a767a-8b5b-4d07-bb20-7657e23bf3a0" class="">ccc</p>
    </div>
    </p>
    <p id="97cd6489-e270-41bb-833c-6b55cb4b2bf4" class="">ddd</p>
  </div>
  </p>
  <p id="ee10b0a1-ee8b-4f8a-9e1d-2b1c5f09006f" class="">eee</p>
</div>

Expected behavior

<div class="page-body">
  <p id="13d0fd81-b7f7-47ca-ae21-5b96a36c5f23" class="">aaa
  <div class="indented">
    <p id="22a66b31-b267-4924-b0c6-cf08772184b6" class="">bbb
    <div class="indented">
      <p id="d07a767a-8b5b-4d07-bb20-7657e23bf3a0" class="">ccc</p>
    </div>
    </p>
    <p id="97cd6489-e270-41bb-833c-6b55cb4b2bf4" class="">ddd</p>
  </div>
  </p>
  <p id="ee10b0a1-ee8b-4f8a-9e1d-2b1c5f09006f" class="">eee</p>
</div>

Actual behavior

<p id="13d0fd81-b7f7-47ca-ae21-5b96a36c5f23" class="">aaa
  </p><div class="indented">
    <p id="22a66b31-b267-4924-b0c6-cf08772184b6" class="">bbb
    </p><div class="indented">
      <p id="d07a767a-8b5b-4d07-bb20-7657e23bf3a0" class="">ccc</p>
    </div>
+   <p></p>
    <p id="97cd6489-e270-41bb-833c-6b55cb4b2bf4" class="">ddd</p>
  </div>
+  <p></p>
  <p id="ee10b0a1-ee8b-4f8a-9e1d-2b1c5f09006f" class="">eee</p>

Runtime

Node v16

Package manager

pnpm

OS

Linux

Build and bundle tools

Vite
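
The extra empty paragraphs in the output above follow from two HTML parsing rules: a div start tag closes a paragraph that is still open, and a stray paragraph end tag makes the parser act as if an empty paragraph had been opened. The sketch below is a heavily simplified model of those two rules with a hypothetical simulate function; it only understands p, div, and text tokens and is not the parser rehype uses.

```javascript
// Simplified model of how an HTML parser treats <p> and <div> tags.
// Only these two element names and text tokens are supported.
function simulate(tokens) {
  const open = [] // stack of currently open element names
  let out = ''

  // Pop (and close) elements up to and including the given name.
  const closeThrough = (name) => {
    let popped
    do {
      popped = open.pop()
      out += `</${popped}>`
    } while (popped !== name)
  }

  for (const token of tokens) {
    if (token.type === 'text') {
      out += token.value
    } else if (token.type === 'start') {
      // A <p> or <div> start tag first closes any <p> that is still open.
      if (open.includes('p')) closeThrough('p')
      open.push(token.name)
      out += `<${token.name}>`
    } else if (open.includes(token.name)) {
      // Matching end tag: close the element normally.
      closeThrough(token.name)
    } else if (token.name === 'p') {
      // Stray </p>: the parser acts as if an empty <p> had been opened.
      out += '<p></p>'
    }
    // Any other stray end tag is ignored in this model.
  }

  return out
}

const tokens = [
  {type: 'start', name: 'p'},
  {type: 'text', value: 'aaa'},
  {type: 'start', name: 'div'},
  {type: 'end', name: 'div'},
  {type: 'end', name: 'p'}
]

const result = simulate(tokens)
console.log(result)
```

For this token list the model yields a closed paragraph, the div, and then an empty paragraph, mirroring the shape of the reported output.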

[BUG] unexpected parsing behaviour for the same html tag

Subject of the issue

When parsing HTML into a syntax tree, the same HTML tag with different properties produces conflicting syntax nodes.

Your environment

OS: win10 20h2
Packages: rehype-parse
Env: node 13, npm 6.12.0

Steps to reproduce

<card type="block" name="hr"></card>
<card type="block" name="localdoc"
    value="data:%7B%22status%22%3A%22done%22%2C%22source%22%3A%22transfer%22%2C%22src%22%3A%22https%3A%2F%2Fwww.yuque.com%2Fattachments%2Fyuque%2F0%2F2021%2Fpdf%2F2596791%2F1615361339259-26318a71-30c9-4f4f-ad67-d384b0b5c8af.pdf%22%2C%22name%22%3A%22Vue.js%E5%89%8D%E7%AB%AF%E5%BC%80%E5%8F%91%E5%9F%BA%E7%A1%80%E4%B8%8E%E9%A1%B9%E7%9B%AE%E5%AE%9E%E6%88%98%20-%20%E9%83%91%E9%9F%A9%E4%BA%AC(2020).pdf%22%2C%22ext%22%3A%22pdf%22%2C%22size%22%3A8758881%2C%22collapsed%22%3Atrue%2C%22margin%22%3Atrue%2C%22id%22%3A%22Uuryf%22%7D">
</card>

Parse the HTML above and compare the syntax trees of the two card elements.

Expected behavior

The two elements should produce the same kind of node, since they differ only in their properties.

What should happen?

But they produce different syntax trees.

Actual behavior

What happens instead?

see screenshot and find out its difference

Republish v9 series to include type updates

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

9.0.3

Link to runnable example

No response

Steps to reproduce

Build with TypeScript 5 with Node16 module resolution.
Alongside hast-util-to-html version 8.0.4.

The published typings point to internals of hast-util-to-html which are no longer accessible.
https://www.npmjs.com/package/rehype-stringify/v/9.0.3?activeTab=code

/** @type {import('unified').Plugin<[Options?]|Array<void>, Node, string>} */
export default function rehypeStringify(
  config: void | import('hast-util-to-html/lib/types').Options | undefined
): void
export type Root = import('hast').Root
export type Node = Root | Root['children'][number]
export type Options = import('hast-util-to-html').Options

Expected behavior

No error, and the published package should point to the exported Options type.

Actual behavior

node_modules/rehype-stringify/lib/index.d.ts:3:25 - error TS2307: Cannot find module 'hast-util-to-html/lib/types' or its corresponding type declarations.

3   config: void | import('hast-util-to-html/lib/types').Options | undefined

Runtime

Node v16

Package manager

npm 8

OS

Linux

Build and bundle tools

Vite

Single quotes in style attributes are turned into html entities - css lint error

Subject of the issue

Stringify is turning my single quotes into &#x27; and makes my CSS linter complain.

Your environment

OS: Ubuntu 20.04
Packages:

❯ yarn list --pattern "unified|rehype-parse|to-vfile|rehype-stringify|fs-extra"
yarn list v1.22.5
├─ @types/[email protected]
├─ [email protected]
├─ [email protected]
│ └─ [email protected]
├─ [email protected]
│ └─ [email protected]
├─ [email protected]
├─ [email protected]
├─ [email protected]
│ └─ [email protected]
├─ [email protected]
│ └─ [email protected]
├─ [email protected]
└─ [email protected]
Done in 0.49s.

Env: node 14.5.0, yarn 1.22.5

Steps to reproduce

full minimal reproduction:

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <h1 style="font-family: 'Some-font', sans-serif;">
</body>
</html>

index.js

const unified = require("unified");
const parser = require("rehype-parse");
const toVfile = require("to-vfile");
const stringify = require("rehype-stringify");
const fs = require("fs-extra");

const fileIn = "./index.html";
const fileOut = "./index-parsed.html";

unified()
  .use(parser)
  .use(stringify)
  .process(toVfile.readSync(fileIn), (err, data) => {
    if (err) {
      throw err; // err is already an Error; rethrow it as is
    }
    fs.writeFileSync(fileOut, String(data));
  });

run with node index.js

Result

index-parsed.html

<!doctype html><html lang="en"><head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <h1 style="font-family: &#x27;Some-font&#x27;, sans-serif;">

</h1></body></html>

css lint error "property value expected at [7, 29]"

Expected behavior

The quotes should remain as they are in the original file.

Actual behavior

A css lint error happens.

Namespaces

Related to syntax-tree/hast#6.

keeping original entities

Subject of the feature

It looks like it is impossible to keep entities as they were written in the input. I found this here: https://github.com/rehypejs/rehype/tree/main/test/fixtures/entities.

Problem

When I stringify my HTML I don't want to change entities content.

Expected behavior

Entities are kept exactly as they were written.

Alternatives

Provide option to keep them as is.

How’s this different from PostHTML?

Subj. See https://github.com/posthtml/posthtml.

DELETE

Sorry, wrong tab of the browser

Trailing whitespace in element is lost

Subject of the issue

Trailing whitespace in element is lost if followed by text node.

a <strong>b </strong>c

Your environment

OS: macOS
Packages: rehype
Env: Node 12.16.3

Steps to reproduce

I tried to create a test in #36, but I am not confident that it is 100% correct; please double-check it.

Expected behavior

The whitespace should be preserved inside the strong element.

Actual behavior

The whitespace is dropped.

"rehype is not an XML parser"

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Problem

The readme says "rehype is not an XML parser" but does not point to an XML parser that works with unified.

rehype/packages/rehype-parse/readme.md

Lines 137 to 140 in 620bb9c

> 👉 **Note**: rehype is not an XML parser.
> It supports SVG as embedded in HTML.
> It does not support the features available in XML.
> Passing SVG files might break but fragments of modern SVG should be fine.

Solution

Suggest an XML parser for unified in the readme, for example something based on xast-util-from-xml.

Alternatives

rehype-parse works for simple XML files, but it fails to parse <![CDATA[ ... ]]> sections.

Example: nixpkgs/doc/functions/library/attrsets.xml (DocBook XML format) (NixOS/nixpkgs#105243)

  <example xml:id="function-library-lib.attrset.attrByPath-example-value-exists">
   <title>Extracting a value from a nested attribute set</title>
<programlisting><![CDATA[
let set = { a = { b = 3; }; };
in lib.attrsets.attrByPath [ "a" "b" ] 0 set
=> 3
]]></programlisting>
  </example>

Workaround

const inputText = (
  readFileSync(inputPath, 'utf8')
  // workaround for parsing xml
  // https://github.com/rehypejs/rehype/issues/109
  //.replace(/<!\[CDATA\[(.*?)\]\]>/sg, '$1')
  .replace(/<!\[CDATA\[(.*?)\]\]>/sg, '<cdata>$1</cdata>')
);
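
A variant of the workaround above, as a rough sketch: instead of wrapping the content in a made-up `<cdata>` element, escape `&` and `<` inside each CDATA section so that an HTML parser treats the content as plain text. The function name `cdataToText` is hypothetical, and the regular expression assumes well-formed, non-nested CDATA sections.

```javascript
// Hypothetical helper: turn <![CDATA[ ... ]]> sections into escaped text
// so that an HTML parser sees ordinary character data.
function cdataToText(xml) {
  // The `s` flag lets `.` match newlines, as in the workaround above.
  return xml.replace(/<!\[CDATA\[(.*?)\]\]>/gs, (_, content) =>
    content.replace(/&/g, '&amp;').replace(/</g, '&lt;')
  )
}

const input = '<programlisting><![CDATA[if (a < b && c) { }]]></programlisting>'
console.log(cdataToText(input))
```

Using a replacer function rather than a `$1` replacement string also avoids surprises when the CDATA content itself contains `$` sequences.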

rehype-stringify 10.0.0 does not compile

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

rehype-stringify v10.0.0

Link to runnable example

No response (FYI stackblitz would not run your MWE template for me)

Steps to reproduce

Using the following snippet

import {unified} from 'unified'
import rehypeParse from 'rehype-parse'
import rehypeStringify from 'rehype-stringify'

const content = await unified()
    .use(rehypeParse)
    .use(rehypeStringify)
    .process('<h1>Hello World</h1>');

Expected behavior

The code should run and compile the HTML to a string.

Actual behavior

The code throws the following runtime error and does not compile anything:

Error [TypeError]: Cannot 'process' without 'Compiler'

Downgrading the package to 9.0.4 fixes the issue.

Runtime

Node.js v20.5.1

Package manager

NPM 9.8.1

OS

Linux

Build and bundle tools

Next.js

Introduce type definitions for rehype

To do list

Please check items when they are resolved.

Unexpected list element hoisting

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Affected packages and versions

[email protected]

Link to runnable example

https://codesandbox.io/p/devbox/loving-borg-9p5c5q?file=%2Fsrc%2Findex.ts%3A11%2C7

Steps to reproduce

When <li> elements are used inside tags other than <ul>, nested <li> children are hoisted out of their parent. In the linked sandbox I used the custom tag <custom_list>, but the same behavior occurs with standard <div> tags.

Input:

<custom_list>
  <li>
    <p>Text</p>
  </li>
  <li>
    <custom_list>
      <li><p>Nested Text</p></li>
    </custom_list>
  </li>
</custom_list>

Output:

<custom_list>
  <li>
    <p>Text</p>
  </li>
  <li>
    <custom_list>
    </custom_list>
  </li>
  <li><p>Nested Text</p></li>
</custom_list>

Expected behavior

The nested list item should remain a child of the element inside the <li> tag.

Actual behavior

The nested list item is hoisted up one level and becomes a sibling of its former parent <li>.
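
This outcome appears consistent with how the HTML Standard handles an `<li>` start tag: the parser walks up the stack of open elements and closes the nearest open `li`, stopping early only at certain "special" elements (such as `ul` or `ol`), while `address`, `div`, and `p` do not stop the walk. Unknown elements like `custom_list` are not special. The following is a simplified model of that step, not the code rehype-parse actually runs. The function name and the abbreviated element list are assumptions, and the step that closes an open `p` is omitted.

```javascript
// Abbreviated, illustrative subset of the "special" elements in the HTML Standard.
const special = new Set([
  'address', 'blockquote', 'body', 'div', 'html', 'li',
  'ol', 'p', 'section', 'table', 'td', 'ul'
])
// Special elements that do NOT stop the walk when an <li> start tag is seen.
const skipped = new Set(['address', 'div', 'p'])

// Hypothetical model: given the names on the stack of open elements,
// return the stack after an <li> start tag has been processed.
function handleLiStartTag(openElements) {
  const stack = [...openElements]
  for (let index = stack.length - 1; index >= 0; index--) {
    const name = stack[index]
    if (name === 'li') {
      // Close the open <li> and everything nested inside it.
      stack.length = index
      break
    }
    if (special.has(name) && !skipped.has(name)) break
  }
  stack.push('li')
  return stack
}

// Nested <li> inside a custom element: the outer <li> is closed first.
console.log(handleLiStartTag(['html', 'body', 'custom_list', 'li', 'custom_list']).join(' > '))
// Nested <li> inside <ul>: the walk stops at <ul>, so nesting is preserved.
console.log(handleLiStartTag(['html', 'body', 'ul', 'li', 'ul']).join(' > '))
```

Under this model, wrapping nested items in `<ul>` or `<ol>` keeps them nested, whereas `<div>` and unknown elements let the new `<li>` close the outer one.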

Runtime

Other (please specify in steps to reproduce)

Package manager

pnpm

OS

Linux

Build and bundle tools

Vite

Failure to parse `noscript` correctly in `head` tag