Comments (6)
Bump to revisit.
from java-html-sanitizer.
I would love an abbreviation for this, too.
Could you please help me get this to work:
private static final PolicyFactory htmlSanitizer = new HtmlPolicyBuilder()
    .allowUrlProtocols("data", "https", "http", "mailto")
    .allowAttributes("src")
        .matching(Pattern.compile("^(data:image/(gif|png|jpeg)[,;]|http|https|mailto|//)", Pattern.CASE_INSENSITIVE))
        .onElements("img")
    .toFactory()
    .and(Sanitizers.IMAGES)
    .and(Sanitizers.BLOCKS);

public static void main(String[] args) {
    System.out.println(htmlSanitizer.sanitize("<img src=\"data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o\" /><p>test</p>"));
}
I get <p>test</p> here, but the embedded image is missing. Changing src="data to src="http works, however. What am I doing wrong?
from java-html-sanitizer.
Got it to work:
private static final PolicyFactory htmlImageSanitizer = new HtmlPolicyBuilder()
    .allowUrlProtocols("data", "http", "https")
    .allowElements("img")
    .allowAttributes("src")
        .matching(Pattern.compile("^(data:image/(gif|png|jpeg)[,;]|http|https|mailto|//).+", Pattern.CASE_INSENSITIVE))
        .onElements("img")
    .toFactory();

private static final PolicyFactory htmlSanitizer = htmlImageSanitizer
    .and(Sanitizers.BLOCKS)
    .and(Sanitizers.FORMATTING)
    .and(Sanitizers.LINKS)
    .and(Sanitizers.STYLES)
    .and(Sanitizers.TABLES);
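One visible difference between the two patterns is the trailing .+ in the working version. The sketch below uses only java.util.regex and assumes that matching(Pattern) tests the whole attribute value, the way Matcher.matches() does; the thread does not confirm this, and the working version also added allowElements("img") and dropped Sanitizers.IMAGES, so this may not be the only cause. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Illustration only: compares the two patterns from the thread using plain
// java.util.regex. Nothing here calls java-html-sanitizer itself.
class SrcPatternCheck {
    // Pattern from the original question: describes only the start of the value.
    static final Pattern PREFIX_ONLY = Pattern.compile(
        "^(data:image/(gif|png|jpeg)[,;]|http|https|mailto|//)",
        Pattern.CASE_INSENSITIVE);

    // Pattern from the working version: the trailing .+ consumes the rest.
    static final Pattern WHOLE_VALUE = Pattern.compile(
        "^(data:image/(gif|png|jpeg)[,;]|http|https|mailto|//).+",
        Pattern.CASE_INSENSITIVE);

    // Assumption: the sanitizer requires the pattern to match the entire value.
    static boolean fullMatch(Pattern pattern, String value) {
        return pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        String src = "data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o";
        System.out.println(fullMatch(PREFIX_ONLY, src));      // false
        System.out.println(fullMatch(WHOLE_VALUE, src));      // true
        System.out.println(PREFIX_ONLY.matcher(src).find());  // true
    }
}
```

Under a whole-value match, the first pattern can never accept a complete URL because it stops after the scheme or media type prefix.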
from java-html-sanitizer.
Your solution seems complicated but the API doesn't obviously allow for a better way, so it's probably the API's fault.
There seem to be some separable concerns here:
- It's hard to restrict data URLs to particular mime-types.
- It's hard to restrict URLs to some attributes and not others.
I think the first problem is a symptom of a larger problem: it's hard to match URLs.
If we had a way to concisely specify a set of URLs, then we could solve the second problem via an API like
allowAttributes("src")
    .matchingUrls(...)
    .onElements("img")
where the ... encapsulates (http, https, or data with content-type in image/(gif|png|jpeg)).
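To make the intended rule concrete, here is a minimal sketch of the check that the ... would need to encapsulate: accept http and https URLs, and accept data URLs only when the media type is image/gif, image/png, or image/jpeg. ImageSrcFilter and isAllowed are hypothetical names; this is not the proposed matchingUrls API and not the url-classifier library, just plain string handling under those assumptions.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of java-html-sanitizer.
class ImageSrcFilter {
    private static final Set<String> IMAGE_TYPES =
        Set.of("image/gif", "image/png", "image/jpeg");

    static boolean isAllowed(String url) {
        int colon = url.indexOf(':');
        if (colon < 0) {
            return false; // scheme-less URLs are out of scope for this sketch
        }
        String scheme = url.substring(0, colon).toLowerCase(Locale.ROOT);
        if (scheme.equals("http") || scheme.equals("https")) {
            return true;
        }
        if (!scheme.equals("data")) {
            return false;
        }
        // data:[<mediatype>][;base64],<data>  (the media type ends at ';' or ',')
        String rest = url.substring(colon + 1);
        int comma = rest.indexOf(',');
        if (comma < 0) {
            return false; // malformed data URL: no payload separator
        }
        String header = rest.substring(0, comma);
        int semi = header.indexOf(';');
        String mediaType = (semi < 0 ? header : header.substring(0, semi))
            .trim()
            .toLowerCase(Locale.ROOT);
        return IMAGE_TYPES.contains(mediaType);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("data:image/gif;base64,R0lGODlh"));            // true
        System.out.println(isAllowed("data:text/html,<script>alert(1)</script>"));  // false
        System.out.println(isAllowed("https://example.com/a.png"));                 // true
        System.out.println(isAllowed("mailto:someone@example.com"));                // false
    }
}
```

Parsing the scheme and media type separately, rather than folding everything into one regular expression, keeps each rule readable and makes it easier to vary the allowed set per attribute.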
What do you think of https://gist.github.com/mikesamuel/e9720a0acc0601372deba3bf0896f33a as a proposed API for solving the larger problem?
Note to self: I'm finding excuses to write specifications, so I should probably figure out what work I'm subconsciously avoiding and do it.
from java-html-sanitizer.
Your API would be great at this point, and implementing it would definitely be worth the effort.
from java-html-sanitizer.
https://github.com/OWASP/url-classifier is an experimental URL classifier API based on that gist.
#126 integrates it into java-html-sanitizer.
Neither is ready for prime-time yet, but you can play around.
from java-html-sanitizer.