Comments (7)

GoogleCodeExporter avatar GoogleCodeExporter commented on August 16, 2024
How is this causing problems?

In HTML,

    <a href="ftp://site.com:user@host/file.txt">click here</a>
    <img src="[email protected]">

should be semantically equivalent to

    <a href="ftp://site.com:user&#64;host/file.txt">click here</a>
    <img src="[email protected]">

since character references in HTML attributes are decoded before the attribute 
value is computed.
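
To make the equivalence concrete, here is a minimal sketch of decoding numeric character references in an attribute value. It is not the sanitizer's or any browser's code; the class and method names are hypothetical, and it handles only numeric references (`&#64;`, `&#x40;`), not named ones such as `&amp;`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java-html-sanitizer.
class CharRefDemo {
    // Matches decimal (&#64;) and hexadecimal (&#x40;) numeric character references.
    private static final Pattern NUMERIC_REF =
        Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    /** Decodes numeric character references; everything else is copied through unchanged. */
    static String decodeNumericRefs(String attrValue) {
        Matcher m = NUMERIC_REF.matcher(attrValue);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(attrValue, last, m.start());
            int codePoint = m.group(1) != null
                ? Integer.parseInt(m.group(1), 16)
                : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(m.group()); // leave out-of-range references as written
            }
            last = m.end();
        }
        out.append(attrValue, last, attrValue.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String encoded = "ftp://site.com:user&#64;host/file.txt";
        String plain = "ftp://site.com:user@host/file.txt";
        // Both spellings yield the same attribute value once decoded.
        System.out.println(decodeNumericRefs(encoded).equals(plain)); // prints true
    }
}
```

Anything that compares attribute values after this decoding step sees no difference between `@` and `&#64;`; only byte-level matching on the serialized markup does.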

Original comment by [email protected] on 19 Jun 2013 at 10:16

from java-html-sanitizer.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 16, 2024
On Jun 20, 2013, at 1:17, "[email protected]"
<[email protected]> wrote:

Hi Mike,

I am presenting emails in a browser. Some of the image references in
the HTML part are via rfc2392 cid:{url-addr-spec} to attached images
that have Content-ID: <{url-addr-spec}>. Since the browser won't
resolve those, I replace them in the content, prior to sending to the
browser, with URLs to the corresponding image in our attachment store.
(Email stored as received. On display request sanitized, then
processed for cid-reference replacement -  work-around: do
cid-replacement, then sanitize).

A fully correct implementation would parse the document as HTML,
canonicalize the img src attribute value (first as CDATA, then as URL,
then as rfc822 addr-spec), then replace it based on lookup of
canonicalized (as URL then as rfc822 addr-spec) content-ids.

My implementation uses a regexp to do the substitution. That works
under the assumption that the img src attribute url-addr-spec and
content-id are canonicalized, which in practice is virtually always
the case.
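
For illustration, a regexp substitution of this kind could be made tolerant of the encoded `@` roughly as follows. This is a sketch under stated assumptions, not the implementation discussed here: the class name, the lookup map, and the attachment URLs are hypothetical; it assumes double-quoted src values, assumes `&#64;` is the only character reference that appears in the content-id, and assumes the replacement URLs are already safe to place in an attribute.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of java-html-sanitizer.
class CidRegexRewrite {
    // Captures: (1) everything up to and including src=", (2) the part after cid:, (3) the closing quote.
    private static final Pattern CID_SRC = Pattern.compile(
        "(<img\\b[^>]*?\\bsrc=\")cid:([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    /** Replaces cid: image sources with URLs from the map; unknown content-ids are left untouched. */
    static String rewrite(String html, Map<String, String> urlByContentId) {
        Matcher m = CID_SRC.matcher(html);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Assumption: the only encoding to undo is "@" written as "&#64;".
            String contentId = m.group(2).replace("&#64;", "@");
            String url = urlByContentId.get(contentId);
            String replacement = (url == null)
                ? m.group()
                : m.group(1) + url + m.group(3);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> store = Map.of("part1@example.com", "https://files.example/att/1");
        String html = "<img alt=\"x\" src=\"cid:part1&#64;example.com\">";
        System.out.println(rewrite(html, store));
        // prints <img alt="x" src="https://files.example/att/1">
    }
}
```

This still shares the limits of any regexp over serialized HTML (single-quoted or unquoted attributes, other character references, percent-encoding), which is why parsing and canonicalizing remains the fully correct route.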

I understand that what I'm doing is not correct, so I'm a bit
embarrassed and can't make a compelling argument. The replacement of @
with the HTML entity reference breaks the simplistic approach. If this
replacement by the sanitizer is not necessary for security, then I'd
rather have them unaltered or move towards canonical/simplified form.

I'd also be happy to understand why the sanitizer must replace @ with
&#64; and redo my part the right way :-).

Thanks!
Fred

Original comment by [email protected] on 20 Jun 2013 at 3:26


GoogleCodeExporter avatar GoogleCodeExporter commented on August 16, 2024
Have you tried allowUrlProtocols("cid", "mid"), possibly combined with an 
AttributePolicy to map cid: URLs to something you can serve?

For reference 
http://owasp-java-html-sanitizer.googlecode.com/svn/trunk/distrib/javadoc/org/owasp/html/HtmlPolicyBuilder.html#allowUrlProtocols%28java.lang.String...%29 :
> Adds to the set of protocols that are allowed in URL attributes. For each URL 
attribute that is allowed, we further constrain it by only allowing the value 
through if it specifies no protocol, or if it specifies one in the 
allowedProtocols white-list.

http://owasp-java-html-sanitizer.googlecode.com/svn/trunk/distrib/javadoc/org/owasp/html/AttributePolicy.html
> A policy that can be applied to an HTML attribute to decide whether or not to 
allow it in the output, possibly after transforming its value.
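
A rough sketch of the mapping such a policy could perform is below. To stay runnable without the sanitizer on the classpath, the class does not implement `AttributePolicy`; its `apply` method only mirrors that interface's `apply(elementName, attributeName, value)` shape, where returning null drops the attribute. The class name, the lookup map, and the URLs are hypothetical, and the sketch assumes the policy is handed the already-decoded attribute value.

```java
import java.util.Map;

// Hypothetical mapper; in real use the same logic would live in an AttributePolicy.
class CidToAttachmentUrl {
    private final Map<String, String> urlByContentId;

    CidToAttachmentUrl(Map<String, String> urlByContentId) {
        this.urlByContentId = urlByContentId;
    }

    /**
     * Returns the value to emit for the attribute, or null to drop it.
     * Non-cid values pass through unchanged; unknown content-ids are dropped.
     */
    String apply(String elementName, String attributeName, String value) {
        if (!value.regionMatches(true, 0, "cid:", 0, 4)) {
            return value;
        }
        return urlByContentId.get(value.substring(4));
    }

    public static void main(String[] args) {
        CidToAttachmentUrl mapper = new CidToAttachmentUrl(
            Map.of("part1@example.com", "https://files.example/att/1"));
        System.out.println(mapper.apply("img", "src", "cid:part1@example.com"));
        // prints https://files.example/att/1
    }
}
```

With the library available, this logic would presumably be attached via `allowUrlProtocols("cid")` plus `allowAttributes("src").matching(...).onElements("img")` on the `HtmlPolicyBuilder`. Depending on the order in which the sanitizer applies its URL protocol check, the output scheme (https here) may need to be allowed as well.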





----

For my reference:
RFC 2392 references RFC 822, not RFC 2822, and there is no update that switches 
it to 2822. Any addr-spec normalization would therefore have to output to the 
intersection of 822 and 2822, which differ around white-space in domains and in 
other places according to RFC 2822 Appendix B, and that might introduce IPv6 
issues in domain literals.

Original comment by [email protected] on 20 Jun 2013 at 2:14


GoogleCodeExporter avatar GoogleCodeExporter commented on August 16, 2024
That looks like a way to go. Thanks!

FWIW - I've always seen rfc2822 as a more sane version of rfc822 that disallows 
some complex and [virtually] never used ways of making simple things like 
addresses complex (e.g. by putting whitespace and comments between atoms). I 
would treat it as rfc2822 in the rfc2392 context and accept that someone 
technically could use rfc822-legal syntax that I would reject.

Original comment by [email protected] on 23 Jun 2013 at 6:10


GoogleCodeExporter avatar GoogleCodeExporter commented on August 16, 2024

Original comment by [email protected] on 28 Feb 2014 at 9:59

  • Changed state: WontFix
