Comments (8)

GoogleCodeExporter commented on July 17, 2024
This is the expected output, but obviously, if the font tag is shuttled into 
some non-row content abyss, then it could be improved.

The design decision was made that way because CSS is such a large attack 
surface that anything we can do to limit attackers' ability to manipulate CSS 
leaves us in a more secure position.

Are there tags besides HTML where this is a problem?

If the styling handler detected that it was inside a table, thead, tbody, 
tfoot, rowgroup, colgroup, or tr element, and instead inserted the <font> 
element inside the contained <td> and <th> elements, would that help?

So instead of

<table>
<font face="Arial, Geneva, sans-serif" style="color:#000">
<tbody>
<tr>
<th>Column One</th>
<th>Column Two</th>
</tr>
<tr>
<td align="center"><font style="background-color:#fffffe"><font size="2">Size 2</font></font></td>
<td align="center"><font style="background-color:#fffffe"><font size="7">Size 7</font></font></td>
</tr>
</tbody>
</font>
</table>

you would get

<table style="color:#000">
<tbody>
<tr>
<th><font face="Arial, Geneva, sans-serif">Column One</font></th>
<th><font face="Arial, Geneva, sans-serif">Column Two</font></th>
</tr>
<tr>
<td align="center"><font style="background-color:#fffffe" face="Arial, Geneva, sans-serif" size="2">Size 2</font></td>
<td align="center"><font style="background-color:#fffffe" face="Arial, Geneva, sans-serif" size="7">Size 7</font></td>
</tr>
</tbody>
</table>
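
A rough sketch of that idea follows, for illustration only. It is not the sanitizer's code: the `Node` class, the `el`/`text` helpers, and `pushFontIntoCells` are hypothetical names, and the toy tree stands in for whatever element representation the real styling handler sees. It walks the table-structure elements and wraps the contents of each <td> and <th> in a <font> element, instead of emitting <font> directly under <table>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FontIntoCellsSketch {
  /** Toy element tree (hypothetical); a tag of "#text" marks a text node. */
  static final class Node {
    final String tag;
    final String value; // attribute string for elements, character data for text
    final List<Node> children = new ArrayList<>();

    Node(String tag, String value, Node... kids) {
      this.tag = tag;
      this.value = value;
      children.addAll(Arrays.asList(kids));
    }

    String render() {
      if (tag.equals("#text")) return value;
      StringBuilder sb = new StringBuilder("<").append(tag);
      if (!value.isEmpty()) sb.append(' ').append(value);
      sb.append('>');
      for (Node child : children) sb.append(child.render());
      return sb.append("</").append(tag).append('>').toString();
    }
  }

  static Node el(String tag, Node... kids) { return new Node(tag, "", kids); }
  static Node text(String data) { return new Node("#text", data); }

  private static final Set<String> CELLS = new HashSet<>(Arrays.asList("td", "th"));
  private static final Set<String> STRUCTURE =
      new HashSet<>(Arrays.asList("table", "thead", "tbody", "tfoot", "tr"));

  /** Descends through table structure and wraps each cell's content in a font element. */
  static void pushFontIntoCells(Node node, String fontAttrs) {
    for (Node child : node.children) {
      if (CELLS.contains(child.tag)) {
        Node font = new Node("font", fontAttrs);
        font.children.addAll(child.children);
        child.children.clear();
        child.children.add(font);
      } else if (STRUCTURE.contains(child.tag)) {
        pushFontIntoCells(child, fontAttrs);
      }
      // Anything else directly under table structure is left untouched.
    }
  }

  public static void main(String[] args) {
    Node table = el("table", el("tbody", el("tr",
        el("th", text("Column One")), el("td", text("Size 2")))));
    pushFontIntoCells(table, "face=\"Arial, Geneva, sans-serif\"");
    System.out.println(table.render());
  }
}
```

The sketch stops at the first cell it reaches, so a table nested inside a cell inherits the wrapper from the outer <font> rather than getting its own.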

Original comment by [email protected] on 2 Feb 2013 at 6:25

from java-html-sanitizer.

GoogleCodeExporter commented on July 17, 2024
White-listing a set of fonts from 
http://webdesign.about.com/od/fonts/qt/web-safe-fonts.htm could keep the size 
of the sanitized output much closer to the size of the input for large tables, 
but I'm loath to do anything that makes some fonts work with tables and others 
not.

Original comment by [email protected] on 2 Feb 2013 at 11:23

GoogleCodeExporter commented on July 17, 2024
I'm actually not that much of an HTML expert; this is just the problem I saw in 
my first test. I assume there will be other constructs that are similarly 
problematic, but I'm not in a position to enumerate them.

I still think you could preserve some style attributes without increasing the 
attack surface. Currently when you see 
style="color: rgb(0, 0, 0); font-family: Arial, Geneva, sans-serif;"
you parse that into some data structures that are subsequently used in a font 
tag. I'm suggesting that instead of emitting the font tag, you could remove the 
user-supplied style attributes, and replace them with a new style attribute that 
you generate from known constructs. So in my example, you would get:
<table style="color:#000; font-family: Arial, Geneva, sans-serif;">
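
To make the suggestion concrete, here is a minimal sketch of "parse, then regenerate from known constructs". Everything in it is an assumption for illustration (the class name, the two supported properties, the output format); it is not how StylingPolicy is written. It handles only color and text-align, so anything else, including font-family, is simply dropped.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StyleRewriteSketch {
  private static final Pattern HEX = Pattern.compile("#(?:[0-9a-f]{3}|[0-9a-f]{6})");
  private static final Pattern RGB = Pattern.compile(
      "rgb\\(\\s*(\\d{1,3})\\s*,\\s*(\\d{1,3})\\s*,\\s*(\\d{1,3})\\s*\\)");
  private static final Pattern ALIGN = Pattern.compile("left|right|center|justify");

  // Hypothetical allow-list: property name -> function returning a
  // regenerated value, or null when the input is not a recognized construct.
  private static final Map<String, Function<String, String>> HANDLERS = new LinkedHashMap<>();
  static {
    HANDLERS.put("color", StyleRewriteSketch::color);
    HANDLERS.put("text-align", v -> ALIGN.matcher(v).matches() ? v : null);
  }

  /** Accepts #rgb, #rrggbb, or rgb(r, g, b); rgb() is re-emitted as #rrggbb. */
  static String color(String v) {
    if (HEX.matcher(v).matches()) return v;
    Matcher m = RGB.matcher(v);
    if (!m.matches()) return null;
    StringBuilder hex = new StringBuilder("#");
    for (int i = 1; i <= 3; i++) {
      int channel = Integer.parseInt(m.group(i));
      if (channel > 255) return null;
      hex.append(String.format("%02x", channel));
    }
    return hex.toString();
  }

  /** Builds a new style value from recognized declarations only. */
  static String rewrite(String untrustedStyle) {
    StringBuilder out = new StringBuilder();
    // Naive split on ';'. Fragments that are not fully recognized are dropped.
    for (String decl : untrustedStyle.split(";")) {
      int colon = decl.indexOf(':');
      if (colon < 0) continue;
      String name = decl.substring(0, colon).trim().toLowerCase(Locale.ROOT);
      String value = decl.substring(colon + 1).trim().toLowerCase(Locale.ROOT);
      Function<String, String> handler = HANDLERS.get(name);
      String safe = handler == null ? null : handler.apply(value);
      if (safe == null) continue;
      if (out.length() > 0) out.append(';');
      out.append(name).append(':').append(safe);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // prints color:#000000;text-align:center
    System.out.println(rewrite("color: rgb(0, 0, 0); behavior: url(x.htc); text-align: center"));
  }
}
```

The point is that no user-supplied text reaches the output unless it matches a known pattern in full, so the attack surface is bounded by the handlers rather than by the CSS grammar.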

I'm concerned that any other approach is likely to impact rendering in ways 
that aren't obvious to end users. It might not be a big deal if someone is 
typing markup into a wiki, but if your app allows someone to paste in a chunk 
of HTML (copied from a web page, or perhaps even a word processing 
application), then users will expect the rendering to look as similar to the 
original as possible.

Original comment by [email protected] on 4 Feb 2013 at 8:08

GoogleCodeExporter commented on July 17, 2024
> if your app allows someone to paste in a chunk of HTML (copied from a web 
> page, or perhaps even a word processing application), then users will expect 
> the rendering to look as similar to the original as possible.

agreed

> instead of emitting the font tag, you could remove the user-supplied style 
> attributes, and replace it with a new style attribute that you generate from 
> known constructs

I do something like that already, just not for font names.
http://code.google.com/p/owasp-java-html-sanitizer/source/browse/trunk/src/tests/org/owasp/html/StylingPolicyTest.java

http://code.google.com/p/owasp-java-html-sanitizer/source/browse/trunk/src/main/org/owasp/html/StylingPolicy.java#320
Only font family, align, and style are put on the <font> tag.  The latter two 
are easy to whitelist.

http://www.w3.org/TR/CSS21/fonts.html#value-def-family-name says
> Font family names must either be given quoted as strings, or unquoted as a 
> sequence of one or more identifiers. This means most punctuation characters and 
> digits at the start of each token must be escaped in unquoted font family names.

I'll see if I can come up with a white-list of generic font names (e.g. 
sans-serif), and then any non-generic font name that contains only ASCII 
alphanumerics and spaces gets quoted and put in a CSS style attribute.  Anything 
with punctuation like the examples below from the CSS spec I'll either reject or 
maybe shove in a <font face>.

    font-family: Ahem!, sans-serif;
    font-family: test@foo, sans-serif;
    font-family: #POUND, sans-serif;
    font-family: Hawaii 5-0, sans-serif;

I'll test whether vendor-prefixed ones like -webkit-small-control survive 
quoting.  Allowing untrusted code to spoof OS controls might enable 
trusted-path violations anyway.
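
The rule described above (pass generic names through, quote plain names, reject anything with punctuation) might look roughly like the sketch below. The class and method names are made up for illustration, the generic list is just the five CSS 2.1 generic families, and this is not the code that ended up in the sanitizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class FontFamilySketch {
  // The five generic families defined by CSS 2.1.
  private static final Set<String> GENERIC = new HashSet<>(
      Arrays.asList("serif", "sans-serif", "cursive", "fantasy", "monospace"));

  /** Returns a safe CSS token for one family name, or null to reject it. */
  static String sanitizeFamily(String rawName) {
    String name = rawName.trim();
    String lower = name.toLowerCase(Locale.ROOT);
    if (GENERIC.contains(lower)) return lower; // generic names stay unquoted
    if (name.isEmpty()) return null;
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      boolean ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
          || (c >= '0' && c <= '9') || c == ' ';
      if (!ok) return null; // any punctuation, including quotes and '-', is rejected
    }
    return "'" + name + "'"; // only ASCII letters, digits, and spaces: safe to quote
  }

  /** Sanitizes a comma-separated font-family value, dropping rejected names. */
  static String sanitizeFontFamily(String value) {
    List<String> kept = new ArrayList<>();
    for (String part : value.split(",")) {
      String safe = sanitizeFamily(part);
      if (safe != null) kept.add(safe);
    }
    return String.join(", ", kept);
  }

  public static void main(String[] args) {
    // prints 'Arial', 'Geneva', sans-serif
    System.out.println(sanitizeFontFamily("Arial, Geneva, sans-serif"));
    // prints sans-serif
    System.out.println(sanitizeFontFamily("Hawaii 5-0, sans-serif"));
  }
}
```

In this sketch a vendor-prefixed name such as -webkit-small-control is rejected outright because of the hyphens, and names that arrive already quoted are rejected too; a real implementation would need to decide both cases deliberately.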

Original comment by [email protected] on 5 Feb 2013 at 10:37

  • Changed state: Accepted

GoogleCodeExporter commented on July 17, 2024
http://code.google.com/p/owasp-java-html-sanitizer/source/detail?r=147 fixes 
this issue.  Let me know if that works for you and I'll push a release to Maven 
Central.

Original comment by [email protected] on 12 Feb 2013 at 7:14

GoogleCodeExporter commented on July 17, 2024
Thanks Mike, this does in fact fix the specific problem I reported.

(I'm still concerned that the fidelity of the transformed HTML will be 
insufficient unless a lot more CSS constructs are supported, though.)

Original comment by [email protected] on 2 Apr 2013 at 10:26

GoogleCodeExporter commented on July 17, 2024
Great.  I'll make sure this is on Maven Central.

Re CSS, my current (full-time) project involves generating sanitizers (and 
other tools) from grammars annotated with schema constraints, so hopefully 
https://code.google.com/p/noinject/source/browse/mlsrc/test-files/san/css/grammar.g 
will soon serve as the basis for a more flexible way to sanitize CSS.  That 
grammar is very drafty and written against an obsolete version of the CSS3 
spec, but the general shape will probably remain the same.

Original comment by [email protected] on 5 Apr 2013 at 4:26

GoogleCodeExporter commented on July 17, 2024
r198 includes a significant rewrite of the CSS sanitizer, which recognizes a 
larger set of CSS properties and no longer introduces <font> elements, so it 
should work well with tables.

Original comment by [email protected] on 24 Jul 2013 at 3:55

  • Changed state: Fixed
