Comments (3)

mikesamuel commented on August 16, 2024

This looks like a legit bug.

My guess is that something like this happens:

  1. The code that eliminates "useless" tags like <img> eliminates the <span>, so it is never pushed onto the stack of open elements.
  2. Then the first </span> is assumed to match up with the non-empty <span style="...">.
  3. Finally the second </span> matches no open <span>, so it is dropped on the floor.
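
The guess above can be reproduced with a toy model. This is only a sketch of the suspected behavior, not the sanitizer's real code: the NaiveBalancer class, the string token format, and the contents of the MUST_HAVE_ATTRIBUTES set here are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

/** Toy model of the suspected behavior. Not the sanitizer's real code. */
final class NaiveBalancer {
  // Assumed contents, for illustration only.
  static final Set<String> MUST_HAVE_ATTRIBUTES = Set.of("span");

  /**
   * Tokens are "<name>" or "<name attrs>" (open tags), "</name>" (close
   * tags), or anything else (text).
   */
  static List<String> sanitize(List<String> tokens) {
    Deque<String> open = new ArrayDeque<>();
    List<String> out = new ArrayList<>();
    for (String t : tokens) {
      if (t.startsWith("</")) {
        String name = t.substring(2, t.length() - 1);
        // Steps 2 and 3: a close tag is emitted only when it matches the
        // innermost element on the stack; otherwise it is dropped.
        if (name.equals(open.peek())) {
          open.pop();
          out.add(t);
        }
      } else if (t.startsWith("<")) {
        int sp = t.indexOf(' ');
        String name = t.substring(1, sp >= 0 ? sp : t.length() - 1);
        if (sp < 0 && MUST_HAVE_ATTRIBUTES.contains(name)) {
          // Step 1: the attribute-less tag is eliminated and never pushed.
          continue;
        }
        open.push(name);
        out.add(t);
      } else {
        out.add(t);
      }
    }
    return out;
  }
}
```

Fed the tokens <span style="color:red">, a, <span>, b, </span>, c, </span>, this model emits the styled open tag, a, b, the first </span>, and then c, so c ends up outside the styled span and the second </span> is lost.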

This code works well for <img>, where it prevents broken-image icons when a src="..." is banned. IIRC, ESAPI also dropped such tags, and I did as well, trying to be compatible by default.

There's no need for that compatibility though.

I see 2 possible fixes.

  1. Push eliminated tags onto the stack of open elements, but marked so that we know to drop the corresponding close tag instead of matching it against an earlier open tag.
  2. Remove tags from the default MUST_HAVE_ATTRIBUTES that can have an end tag.

(1) sounds generally useful from the point of view of preserving the intent of the input.
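
A minimal sketch of what fix (1) might look like, using a toy token model rather than the sanitizer's actual classes. PhantomBalancer, Entry, the string token format, and the contents of MUST_HAVE_ATTRIBUTES are hypothetical, and the model ignores void elements such as <img>, which have no end tag and would not be pushed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

/** Toy model of fix (1). Not the sanitizer's real code. */
final class PhantomBalancer {
  // Assumed contents, for illustration only.
  static final Set<String> MUST_HAVE_ATTRIBUTES = Set.of("span");

  /** An open element; phantom means its open tag was eliminated. */
  private static final class Entry {
    final String name;
    final boolean phantom;

    Entry(String name, boolean phantom) {
      this.name = name;
      this.phantom = phantom;
    }
  }

  /**
   * Tokens are "<name>" or "<name attrs>" (open tags), "</name>" (close
   * tags), or anything else (text).
   */
  static List<String> sanitize(List<String> tokens) {
    Deque<Entry> open = new ArrayDeque<>();
    List<String> out = new ArrayList<>();
    for (String t : tokens) {
      if (t.startsWith("</")) {
        String name = t.substring(2, t.length() - 1);
        Entry top = open.peek();
        if (top != null && top.name.equals(name)) {
          open.pop();
          // A phantom entry swallows its own close tag.
          if (!top.phantom) {
            out.add(t);
          }
        }
      } else if (t.startsWith("<")) {
        int sp = t.indexOf(' ');
        String name = t.substring(1, sp >= 0 ? sp : t.length() - 1);
        boolean phantom = sp < 0 && MUST_HAVE_ATTRIBUTES.contains(name);
        // Eliminated tags are still pushed, but marked as phantom.
        open.push(new Entry(name, phantom));
        if (!phantom) {
          out.add(t);
        }
      } else {
        out.add(t);
      }
    }
    return out;
  }
}
```

With the tokens <span style="color:red">, a, <span>, b, </span>, c, </span>, the first </span> pops the phantom entry and is dropped, and the second closes the styled span, so a, b, and c all stay inside it.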

Perhaps the code that eliminates text nodes inside eliminated <script>, <style>, <iframe>, and <object> elements could be rewritten to check containment based on such phantom elements on the stack.

Since none of those contain tag content, checking the top of the stack is cheap, and the test becomes whether the top of the stack is in the set of tags that browser stylesheets define as display:none.
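
The top-of-stack test might look roughly like the sketch below. The names TextSuppression, Entry, HIDDEN_BY_DEFAULT, and shouldDropText are hypothetical, and the set simply reuses the four elements named above instead of an actual browser stylesheet.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

/** Toy model of the text-suppression check. Not the sanitizer's real code. */
final class TextSuppression {
  /** An open element; phantom means its open tag was eliminated. */
  static final class Entry {
    final String name;
    final boolean phantom;

    Entry(String name, boolean phantom) {
      this.name = name;
      this.phantom = phantom;
    }
  }

  // Assumed set, taken from the elements listed in the comment above.
  static final Set<String> HIDDEN_BY_DEFAULT =
      Set.of("script", "style", "iframe", "object");

  /** Text is dropped only when the innermost element is a hidden phantom. */
  static boolean shouldDropText(Deque<Entry> open) {
    Entry top = open.peek();
    return top != null && top.phantom && HIDDEN_BY_DEFAULT.contains(top.name);
  }
}
```

Only the top entry is inspected, so the check costs the same no matter how deep the stack is.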

from java-html-sanitizer.

mikesamuel commented on August 16, 2024

Fixed at 2b3f0aa

sparkyfen commented on August 16, 2024

I'd like to know how we can have end tags that have no matching open tag eliminated, or have the HtmlChangeListener emit a message telling us that there was an end tag without a start tag.
