Comments (9)
In HtmlSanitizer, text gets decoded. This should not be done when sanitizing, or it should at least be optional.
In the sanitize method:
    case TEXT:
      balancer.text(
          Encoding.decodeHtml(html.substring(token.start, token.end)));
from java-html-sanitizer.
Duplicate of issue 30.
from java-html-sanitizer.
If you're going to do string comparison, instead of doing
    String inputString = ...;
    String sanitizedString = HTMLSanitizer.sanitize(inputString, myPolicy);
    if (inputString.equals(sanitizedString)) {
      // Assume no tags or attributes rejected
    } else {
      // Assume some tags or attributes rejected
    }
why not do
    String inputString = ...;
    String normalizedString = HTMLSanitizer.sanitize(inputString, policyThatAllowsEverything);
    String sanitizedString = HTMLSanitizer.sanitize(inputString, myPolicy);
    if (normalizedString.equals(sanitizedString)) {
      // Assume no tags or attributes rejected
    } else {
      // Assume some tags or attributes rejected
    }
?
That should make your equality test independent of any changes in the way text nodes are represented.
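To illustrate why comparing against a normalized baseline helps, here is a minimal sketch. `toySanitize` is a hypothetical stand-in written only for this example; it is not the library's sanitizer, and its regex-based tag matching is not safe for real HTML. It assumes the only re-encoding of text is `&` to `&amp;`, and that a `null` allow-list plays the role of `policyThatAllowsEverything`.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of the comparison idea; NOT the java-html-sanitizer API.
class NormalizedCompare {
    private static final Pattern TAG =
        Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Hypothetical stand-in sanitizer: drops tags whose name is not in
    // `allowed` (null means allow every tag) and re-encodes '&' in text.
    static String toySanitize(String html, Set<String> allowed) {
        Matcher m = TAG.matcher(html);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Text between tags is always re-encoded, whatever the policy.
            out.append(html.substring(last, m.start()).replace("&", "&amp;"));
            String name = m.group(1).toLowerCase(Locale.ROOT);
            if (allowed == null || allowed.contains(name)) {
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(html.substring(last).replace("&", "&amp;"));
        return out.toString();
    }

    // True only if the policy removed something; differences caused purely
    // by text re-encoding cancel out because both sides are re-encoded.
    static boolean somethingRejected(String input, Set<String> allowed) {
        String normalized = toySanitize(input, null);
        String sanitized = toySanitize(input, allowed);
        return !normalized.equals(sanitized);
    }

    public static void main(String[] args) {
        Set<String> allowed = Collections.singleton("b");
        System.out.println(somethingRejected("<b>Tom & Jerry</b>", allowed));
        System.out.println(somethingRejected("<b>x</b><script>y</script>", allowed));
    }
}
```

In this toy model, a direct `input.equals(sanitized)` check would flag `<b>Tom & Jerry</b>` as modified even though no tag was rejected, because the ampersand was re-encoded. The normalized comparison does not.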
from java-html-sanitizer.
Thanks, that is a smart workaround for the problem of giving feedback. The spaces and &nbsp; I could find and replace after the sanitization.
I talked to our pen testers and they don't see the need for the text encoding as long as the tags are removed. I also spoke to our DB admin, and he was very unhappy with the encoding, because he would need to expand all the fields fivefold to accommodate the longer texts.
So I think I will still remove the encoding.
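The field-size concern can be checked with a little arithmetic. The sketch below is not the sanitizer's encoder; `encodeText` is a hypothetical stand-in that assumes only the five characters shown are escaped. Under that assumption the fivefold figure is the worst case (a field made entirely of `&`, `"` or `'`), while ordinary prose grows much less.

```java
// Hypothetical stand-in for an HTML text encoder; NOT the library's code.
// Assumes only these five characters are escaped; all others pass through.
class EncodingExpansion {
    static String encodeText(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;"); break;  // 1 char -> 5 chars
                case '<':  sb.append("&lt;");  break;  // 1 char -> 4 chars
                case '>':  sb.append("&gt;");  break;  // 1 char -> 4 chars
                case '"':  sb.append("&#34;"); break;  // 1 char -> 5 chars
                case '\'': sb.append("&#39;"); break;  // 1 char -> 5 chars
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Ratio of encoded length to raw length (1.0 means no growth).
    static double expansion(String raw) {
        if (raw.isEmpty()) {
            return 1.0;
        }
        return (double) encodeText(raw).length() / raw.length();
    }

    public static void main(String[] args) {
        System.out.println(expansion("&&&&"));        // worst case: 20 / 4
        System.out.println(expansion("Tom & Jerry")); // typical text: 15 / 11
    }
}
```

Sizing columns for the worst case is conservative. Measuring `expansion` over a sample of real field values would give a more realistic factor.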
from java-html-sanitizer.
For others: apart from the workaround from mikesamuel, you can also use the HtmlChangeListener to track removed tags and attributes, like this:
    import java.util.List;
    import org.owasp.html.HtmlChangeListener;

    // Records the names of discarded tags and attributes in the context list.
    public class MyHtmlChangeListener implements HtmlChangeListener<List<String>> {
      @Override
      public void discardedTag(List<String> context, String elementName) {
        if (context != null) {
          context.add(elementName);
        }
      }

      @Override
      public void discardedAttributes(List<String> context, String tagName, String... attributeNames) {
        if (context != null) {
          for (String attributeName : attributeNames) {
            context.add(attributeName);
          }
        }
      }
    }
from java-html-sanitizer.
> So I think I will still remove the encoding.
Fair enough. When you fork, I'd appreciate it if you'd change the package from org.owasp to avoid confusion with the OWASP-endorsed version.
I also won't be held responsible for informing forks of emerging threats, so you'll have to track those yourself.
from java-html-sanitizer.
Yes, thanks. I have to check whether I'm allowed to release my code into the public domain.
from java-html-sanitizer.
> Yes, thanks. I have to check whether I'm allowed to release my code into the public domain.
I think "public domain" has a specific legal meaning. This project is not public domain; it has been released under the Apache 2 license and section 4 specifies obligations related to redistribution of modifications.
from java-html-sanitizer.
Mmm, yes, you are right... maybe I'm not allowed to change it then. I will let the project manager think about that 😄
from java-html-sanitizer.