Comments (9)
Is the image copyrighted?
Thanks for the note on the regexes. I have a rewrite of the CSS lexer mostly
ready to go which gets rid of regular expressions (and the associated unbounded
recursion and backtracking) entirely.
Original comment by [email protected]
on 16 Jul 2013 at 4:22
- Changed state: Accepted
from java-html-sanitizer.
Great!
Copyright - I don't know.
It looks like someone sent this page:
http://www.thesouthafrican.com/entertainment/south-african-couple-launches-english-bubbly-to-rival-champagne.htm
and that the image is the red "+share" icon at the top. Attaching as both
base64 (from the html) and decoded to gif.
Original comment by [email protected]
on 16 Jul 2013 at 9:56
Attached is Untitled.zip, which includes Untitled.html, Untitled.b64, and
Untitled.gif, which I made using GIMP. It triggers the same error.
I hereby release this to the public domain without any warranty explicit or
implied.
Original comment by [email protected]
on 16 Jul 2013 at 10:01
Great. Thanks.
Original comment by [email protected]
on 16 Jul 2013 at 10:38
https://code.google.com/p/owasp-java-html-sanitizer/source/detail?r=180
replaces the CSS lexer with one that passes your test and doesn't backtrack,
but I'm not going to close this bug until that's production ready, since the
new code has not been thoroughly vetted on malformed inputs.
My plan thus far is
1. Rewrite the CSS filter with a token-level filter based on Caja white-lists (
https://code.google.com/p/google-caja/wiki/CajaWhitelists ) that conservatively
identifies all URLs, and normalizes tokens. This should fix all the border
problems by being more permissive and put us in a place to allow data URLs.
2. Test the new lexer with fuzzers and white-box tests until I'm confident that
there are no infinite loops.
3. Push a release with the more liberal CSS sanitizer.
4. Look into data: URLs and plan from there.
I should be able to get 1-3 done this week or next. Feel free to play
around with trunk in the meantime, but please don't roll out to production yet.
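For readers wondering what a lexer without regular expressions looks like, here is a minimal sketch. It is not the code from r180; the class name `TinyCssLexer` and its token rules are hypothetical and far simpler than real CSS tokenization. The point it illustrates is that a hand-written scanner which consumes at least one character per loop iteration cannot backtrack or recurse, so it finishes in time linear in the input length.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-pass tokenizer; not the sanitizer's actual lexer. */
final class TinyCssLexer {
  /** Splits input into word-like runs, quoted strings, whitespace, and punctuation. */
  static List<String> lex(String css) {
    List<String> tokens = new ArrayList<>();
    int i = 0;
    int n = css.length();
    while (i < n) {
      int start = i;
      char c = css.charAt(i);
      if (Character.isWhitespace(c)) {
        // Collapse a run of whitespace into a single space token.
        while (i < n && Character.isWhitespace(css.charAt(i))) { i++; }
        tokens.add(" ");
      } else if (isWordPart(c)) {
        while (i < n && isWordPart(css.charAt(i))) { i++; }
        tokens.add(css.substring(start, i));
      } else if (c == '"' || c == '\'') {
        i++; // opening quote
        while (i < n && css.charAt(i) != c) {
          // A backslash escapes the next character, if there is one.
          i += (css.charAt(i) == '\\' && i + 1 < n) ? 2 : 1;
        }
        if (i < n) { i++; } // closing quote; an unterminated string runs to the end
        tokens.add(css.substring(start, i));
      } else {
        // Any other character is a one-character punctuation token.
        i++;
        tokens.add(String.valueOf(c));
      }
    }
    return tokens;
  }

  private static boolean isWordPart(char c) {
    return Character.isLetterOrDigit(c) || c == '-' || c == '_'
        || c == '.' || c == '#' || c == '%';
  }
}
```

Because every branch advances `i`, pathological inputs such as very long runs of one character only cost more loop iterations, not deeper stacks.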
Original comment by [email protected]
on 17 Jul 2013 at 12:24
Re data: attributes, I'm unsure what to do there.
https://www.owasp.org/images/0/03/Mario_Heiderich_OWASP_Sweden_The_image_that_called_me.pdf
suggests that allowing images is not ok, and I don't know whether
browsers agree on the origin of an image from a data URL.
I could whitelist
data:image/gif;base64,...
data:image/png;base64,...
data:image/jpeg;base64,...
where the first 4 characters of ... are the b64 encoding of the first 3
characters of the magic number for that image type.
That doesn't eliminate the risk of polyglots like
http://www.thinkfu.com/blog/gifjavascript-polyglots but, combined with the
explicit MIME type, it should suffice.
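A minimal sketch of the check described above, assuming an exact, case-sensitive match on the three listed prefixes. The class and method names are hypothetical and not part of the sanitizer's API. It base64-encodes the first 3 bytes of each format's magic number, which always yields exactly 4 characters, and compares them with the start of the payload.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of java-html-sanitizer. */
final class DataUrlCheck {
  // First 3 bytes of each image format's magic number.
  private static final Map<String, byte[]> MAGIC = new LinkedHashMap<>();
  static {
    MAGIC.put("image/gif", new byte[] {'G', 'I', 'F'});
    MAGIC.put("image/png", new byte[] {(byte) 0x89, 'P', 'N'});
    MAGIC.put("image/jpeg", new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF});
  }

  /** True only if the declared MIME type and the payload's leading bytes agree. */
  static boolean isAllowedImageDataUrl(String url) {
    for (Map.Entry<String, byte[]> e : MAGIC.entrySet()) {
      String prefix = "data:" + e.getKey() + ";base64,";
      if (url.startsWith(prefix)) {
        // 3 bytes encode to exactly 4 base64 characters, with no padding.
        String expected = Base64.getEncoder().encodeToString(e.getValue());
        return url.startsWith(expected, prefix.length());
      }
    }
    return false; // any other scheme, MIME type, or encoding is rejected
  }
}
```

As noted, this only checks that the payload starts like the declared image type; it does not rule out polyglot files.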
Original comment by [email protected]
on 17 Jul 2013 at 10:20
Tested with trunk/r198 and can confirm that the stack overflow no longer
happens. For my application, I don't miss the image. We want to show email
reasonably faithfully, but removing attack vectors is much more important than
looks, and this type of inline image data is rare (it was one document in about
150,000).
Original comment by [email protected]
on 18 Jul 2013 at 7:50
The stack overflow is fixed at r198.
Punting on support for data URLs until there is a client who really needs image
embedding via CSS.
Original comment by [email protected]
on 24 Jul 2013 at 3:47
- Changed state: Fixed
Thanks, Mike!! OK in testing. Will use in prod as soon as it shows up on Maven
Central.
Original comment by [email protected]
on 24 Jul 2013 at 4:20