Git Product home page Git Product logo

Comments (16)

flavorjones avatar flavorjones commented on July 29, 2024

srsly? sigh. can you send me a test case?

from loofah.

berlin-ab avatar berlin-ab commented on July 29, 2024

We have discovered it's a problem concerning libxml2, which escapes certain characters into HTML entities. So, when using xss_foliate, we're seeing things like & and " being escaped in the database as & and ", which does not seem to be the desired behavior.

See: http://svn.gnome.org/viewvc/libxml2/trunk/entities.c?revision=3774&view=markup
Line 760

Here are several test cases and their failing output:

it "should not scrub double quotes into html entities" do
  answer = new_answer(:text => "\"something\"")
  answer.valid?
  answer.text.should == "\"something\""
end

it "should not scrub ampersands into html entities" do
  answer = new_answer(:text => "& Something")
  answer.valid?
  answer.text.should == "& Something"
end

it "should not scrub \\r html entities" do
  answer = new_answer(:text => "\r Something")
  answer.valid?
  answer.text.should == "Another \r Something"
end

it "should not scrub \\n html entities" do
  answer = new_answer(:text => "\n Something")
  answer.valid?
  answer.text.should == "ANother \n Something"
end

'Answer XSS prevention should not scrub double quotes into html entities' FAILED
expected: ""something"",
got: ""something"" (using ==)
./spec/models/answer_spec.rb:129:

'Answer XSS prevention should not scrub ampersands into html entities' FAILED
expected: "& Something",
got: "& Something" (using ==)
./spec/models/answer_spec.rb:135:

'Answer XSS prevention should not scrub \r html entities' FAILED
expected: "Another \r Something",
got: "Something" (using ==)
./spec/models/answer_spec.rb:141:

'Answer XSS prevention should not scrub \n html entities' FAILED
expected: "ANother \n Something",
got: "Something" (using ==)
./spec/models/answer_spec.rb:147:

Adam Berlin and Ian Zabel

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

If you fix tests 3 and 4 (the strings you're testing don't match), this passes for me on libxml 2.7.5. Here's my Nokogiri::VERSION_INFO:

{"warnings"=>[], "libxml"=>{"loaded"=>"2.7.5", "binding"=>"extension", "compiled"=>"2.7.5"}, "nokogiri"=>"1.4.1"}

See this commit for the tests I'm running: http://github.com/flavorjones/loofah/commit/a9a93e7abefb5d0a704b6fe2d813fa202b639298

What version of libxml are you using?

from loofah.

berlin-ab avatar berlin-ab commented on July 29, 2024

Thanks. We found our problems w/ tests 3 & 4 while implementing our workaround. Here's our Nokogiri version information.

{"warnings"=>[], "libxml"=>{"loaded"=>"2.7.7", "binding"=>"extension", "compiled"=>"2.7.7"}, "nokogiri"=>"1.4.1"}

from loofah.

reidab avatar reidab commented on July 29, 2024

There's still a problem here. I'm seeing ampersands (and probably other things) being escaped into HTML entities unexpectedly.

m = Model.new(:text => "stuff & things")
m.valid?
m.test
 => "stuff & things"

I'm looking at the tests you posted above (now in the loofah-activerecord repo), it doesn't seem that xss_foliate is being called on the model in their context. When I a call to Post.xss_foliate, as follows, all of the tests in the context fail.

context "these tests should pass for libxml 2.7.5 and later" do
  setup do
    Post.xss_foliate # <== added this
  end

  should "not scrub ampersands into html entities" do
    answer = new_post(:plain_text => "& Something")
    answer.valid?
    assert_equal "& Something", answer.plain_text
  end
end

Nokogiri::VERSION_INFO:
{"warnings"=>[], "ruby"=>{"engine"=>"ruby", "version"=>"1.8.7", "description"=>"ruby 1.8.7 (2010-04-19 patchlevel 253) [i686-darwin10.6.0], MBARI 0x6770, Ruby Enterprise Edition 2010.02", "platform"=>"i686-darwin10.6.0"}, "libxml"=>{"loaded"=>"2.7.3", "binding"=>"extension", "compiled"=>"2.7.3"}, "nokogiri"=>"1.5.0"}

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

Thanks for the additional information. I will be looking into this (and the rest of the Loofah issue backlog) this week.

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

Hey everyone!

OK, after looking at this, and fixing the tests (thanks, @reidab), I have to admit that I don't understand the use case here.

You're "sanitizing" a string with the expectation of eventually rendering it into an HTML view. Bare ampersands are not valid HTML. So loofah is converting those into HTML entities, so that it's proper HTML. To me, this seems like expected behavior.

Can someone explain why this isn't the expected behavior?

from loofah.

reidab avatar reidab commented on July 29, 2024

For us, it's because we don't always have the expectation of eventually rendering it into an HTML view. Sometimes we're outputting it as HTML, but other times we're outputting the same database fields as part of an iCalendar or JSON export. We're sanitizing records using loofah before they hit our database so that we know that we have a good plain-text baseline to properly encode for any output format.

from loofah.

marknadig avatar marknadig commented on July 29, 2024

+1 to @reidab's UC: using loofa-activerecord to scrub entities, but not always consuming in html views. After tracing through the code Friday, though, I don't see a simple solution unless there's a second-pass after LibXML.xmlEncodeSpecialChars (via fragment.text) to unescape the desired entities. I imagine we'd have this same issue w/ &quote &gt &lt and others.

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

OK, the only solution I have is that fields which are not intended for rendering in HTML contexts will be marked as such; and then those fields will have a second-pass filter applied which will unescape entities.

Objections? Concerns?

from loofah.

marknadig avatar marknadig commented on July 29, 2024

From the UC I have, I would want this unescaping of the ampersand & quote to be done across the board. So, having to enumerate for each model all the fields would be a pain. What would be handy is at the model level be able to specify
xss_foliate :text_no_html
or a way in the initializer to specify the default is :text_no_html instead of :text.

Thanks for looking at this.

from loofah.

pivotal-medici avatar pivotal-medici commented on July 29, 2024

+1 to @reidab and @digger69's use case. We would like to use the tag scrubbing features of Loofa but not encode HTML entities when saving, as we output the values in textareas and inputs.

We are able to turn off encoding using the :encode_special_chars option to Loofah::TextBehavior.text, but this option is not exposed to xss_foliate. Perhaps it could be passed through on the options hash?

Thanks for this awesome tool!

Nokogiri::VERSION_INFO => {"warnings"=>[], "nokogiri"=>"1.5.5", "ruby"=> {"version"=>"1.9.3", "platform"=>"x86_64-darwin12.1.0", "description" => "ruby 1.9.3p327 (2012-11-10 revision 37606) [x86_64-darwin12.1.0]", "engine"=>"ruby"}, "libxml"=>{"binding"=>"extension", "compiled"=>"2.7.8", "loaded"=>"2.7.8"}}

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

Got it. Thanks for the feedback. Will try hard to get an update out this weekend.

from loofah.

timherby avatar timherby commented on July 29, 2024

+1 on this. I was hoping to use the Opt-out method so that we are "secure by default" and remove any potential threats in all fields, but that causes ampersands and other characters to be escaped. If we need escaping, it will be done at the UI layer.

Loofah ActiveRecord looks like the perfect solution for this, except for the unexpected escaping. Setting up a configuration switch to turn off the escaping by applying an extra filter seems to be the cleanest option. Is that still on the roadmap? If I can get some guidance on the direction you're taking, I'd be glad to help as this would be very useful to us (and many others it appears from this thread).

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

Addressed this in loofah-activerecord's commit aa36bdc.

from loofah.

flavorjones avatar flavorjones commented on July 29, 2024

Fix is in loofah-activerecord 1.2.0.

from loofah.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.