Comments (16)
srsly? sigh. can you send me a test case?
from loofah.
We have discovered it's a problem concerning libxml2, which escapes certain characters into HTML entities. So, when using xss_foliate, we're seeing things like & and " being escaped in the database as & and ", which does not seem to be the desired behavior.
See: http://svn.gnome.org/viewvc/libxml2/trunk/entities.c?revision=3774&view=markup
Line 760
Here are several test cases and their failing output:
it "should not scrub double quotes into html entities" do
answer = new_answer(:text => "\"something\"")
answer.valid?
answer.text.should == "\"something\""
end
it "should not scrub ampersands into html entities" do
answer = new_answer(:text => "& Something")
answer.valid?
answer.text.should == "& Something"
end
it "should not scrub \\r html entities" do
answer = new_answer(:text => "\r Something")
answer.valid?
answer.text.should == "Another \r Something"
end
it "should not scrub \\n html entities" do
answer = new_answer(:text => "\n Something")
answer.valid?
answer.text.should == "ANother \n Something"
end
'Answer XSS prevention should not scrub double quotes into html entities' FAILED
expected: ""something"",
got: ""something"" (using ==)
./spec/models/answer_spec.rb:129:
'Answer XSS prevention should not scrub ampersands into html entities' FAILED
expected: "& Something",
got: "& Something" (using ==)
./spec/models/answer_spec.rb:135:
'Answer XSS prevention should not scrub \r html entities' FAILED
expected: "Another \r Something",
got: "Something" (using ==)
./spec/models/answer_spec.rb:141:
'Answer XSS prevention should not scrub \n html entities' FAILED
expected: "ANother \n Something",
got: "Something" (using ==)
./spec/models/answer_spec.rb:147:
Adam Berlin and Ian Zabel
from loofah.
If you fix tests 3 and 4 (the strings you're testing don't match), this passes for me on libxml 2.7.5. Here's my Nokogiri::VERSION_INFO:
{"warnings"=>[], "libxml"=>{"loaded"=>"2.7.5", "binding"=>"extension", "compiled"=>"2.7.5"}, "nokogiri"=>"1.4.1"}
See this commit for the tests I'm running: http://github.com/flavorjones/loofah/commit/a9a93e7abefb5d0a704b6fe2d813fa202b639298
What version of libxml are you using?
from loofah.
Thanks. We found our problems w/ tests 3 & 4 while implementing our workaround. Here's our Nokogiri version information.
{"warnings"=>[], "libxml"=>{"loaded"=>"2.7.7", "binding"=>"extension", "compiled"=>"2.7.7"}, "nokogiri"=>"1.4.1"}
from loofah.
There's still a problem here. I'm seeing ampersands (and probably other things) being escaped into HTML entities unexpectedly.
m = Model.new(:text => "stuff & things")
m.valid?
m.test
=> "stuff & things"
I'm looking at the tests you posted above (now in the loofah-activerecord repo), it doesn't seem that xss_foliate is being called on the model in their context. When I a call to Post.xss_foliate, as follows, all of the tests in the context fail.
context "these tests should pass for libxml 2.7.5 and later" do
setup do
Post.xss_foliate # <== added this
end
should "not scrub ampersands into html entities" do
answer = new_post(:plain_text => "& Something")
answer.valid?
assert_equal "& Something", answer.plain_text
end
end
Nokogiri::VERSION_INFO:
{"warnings"=>[], "ruby"=>{"engine"=>"ruby", "version"=>"1.8.7", "description"=>"ruby 1.8.7 (2010-04-19 patchlevel 253) [i686-darwin10.6.0], MBARI 0x6770, Ruby Enterprise Edition 2010.02", "platform"=>"i686-darwin10.6.0"}, "libxml"=>{"loaded"=>"2.7.3", "binding"=>"extension", "compiled"=>"2.7.3"}, "nokogiri"=>"1.5.0"}
from loofah.
Thanks for the additional information. I will be looking into this (and the rest of the Loofah issue backlog) this week.
from loofah.
Hey everyone!
OK, after looking at this, and fixing the tests (thanks, @reidab), I have to admit that I don't understand the use case here.
You're "sanitizing" a string with the expectation of eventually rendering it into an HTML view. Bare ampersands are not valid HTML. So loofah is converting those into HTML entities, so that it's proper HTML. To me, this seems like expected behavior.
Can someone explain why this isn't the expected behavior?
from loofah.
For us, it's because we don't always have the expectation of eventually rendering it into an HTML view. Sometimes we're outputting it as HTML, but other times we're outputting the same database fields as part of an iCalendar or JSON export. We're sanitizing records using loofah before they hit our database so that we know that we have a good plain-text baseline to properly encode for any output format.
from loofah.
+1 to @reidab's UC: using loofa-activerecord to scrub entities, but not always consuming in html views. After tracing through the code Friday, though, I don't see a simple solution unless there's a second-pass after LibXML.xmlEncodeSpecialChars (via fragment.text) to unescape the desired entities. I imagine we'd have this same issue w/ "e > < and others.
from loofah.
OK, the only solution I have is that fields which are not intended for rendering in HTML contexts will be marked as such; and then those fields will have a second-pass filter applied which will unescape entities.
Objections? Concerns?
from loofah.
From the UC I have, I would want this unescaping of the ampersand & quote to be done across the board. So, having to enumerate for each model all the fields would be a pain. What would be handy is at the model level be able to specify
xss_foliate :text_no_html
or a way in the initializer to specify the default is :text_no_html instead of :text.
Thanks for looking at this.
from loofah.
+1 to @reidab and @digger69's use case. We would like to use the tag scrubbing features of Loofa but not encode HTML entities when saving, as we output the values in textareas and inputs.
We are able to turn off encoding using the :encode_special_chars option to Loofah::TextBehavior.text, but this option is not exposed to xss_foliate. Perhaps it could be passed through on the options hash?
Thanks for this awesome tool!
Nokogiri::VERSION_INFO => {"warnings"=>[], "nokogiri"=>"1.5.5", "ruby"=> {"version"=>"1.9.3", "platform"=>"x86_64-darwin12.1.0", "description" => "ruby 1.9.3p327 (2012-11-10 revision 37606) [x86_64-darwin12.1.0]", "engine"=>"ruby"}, "libxml"=>{"binding"=>"extension", "compiled"=>"2.7.8", "loaded"=>"2.7.8"}}
from loofah.
Got it. Thanks for the feedback. Will try hard to get an update out this weekend.
from loofah.
+1 on this. I was hoping to use the Opt-out method so that we are "secure by default" and remove any potential threats in all fields, but that causes ampersands and other characters to be escaped. If we need escaping, it will be done at the UI layer.
Loofah ActiveRecord looks like the perfect solution for this, except for the unexpected escaping. Setting up a configuration switch to turn off the escaping by applying an extra filter seems to be the cleanest option. Is that still on the roadmap? If I can get some guidance on the direction you're taking, I'd be glad to help as this would be very useful to us (and many others it appears from this thread).
from loofah.
Addressed this in loofah-activerecord's commit aa36bdc.
from loofah.
Fix is in loofah-activerecord 1.2.0.
from loofah.
Related Issues (20)
- How to bypass characters like less than character when sanitising the data HOT 2
- A whitespace handling change in v2.9.0 is breaking a test in our code HOT 1
- `#text` should only render HTML elements HOT 1
- explore testing with the portswigger xss cheat sheet exploits
- `#to_text` doesn't handle `<br>` elements well. HOT 4
- Adding sms to ACCEPTABLE_PROTOCOLS HOT 3
- tests fail with latest versions of dependencies HOT 1
- Loofah removes HOT 3
- HTML5 empty attributes are being scrubbed HOT 5
- CSS Scrubber is removing the builtin extended CSS color properties in `>= v2.9.0` HOT 5
- RFC: should Loofah sanitize `<style>` tag contents HOT 2
- Preserving emails that look like tags HOT 2
- loofah issue with recent CVE release HOT 2
- unclosed html tags are also being pruned off, ideal expectation is to have only closed tags pruned HOT 12
- Getting errors using Nokogiri < 1.12 HOT 11
- pass encode_special_chars to to_s HOT 1
- Whitespace Added around "/" in CSS HOT 3
- Add scrub to append `target=_blank` to all links HOT 3
- Built-in scrubbers don't escape unsafe HTML with Nokogiri > 1.15 HOT 2
- feat: encapsulate some whitespace-handling into a scrubber (or scrubbers) HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from loofah.