Loofah is an HTML sanitizer. It will always fix broken markup, but can also sanitize unsafe tags in a few different ways, and transform the markup for storage or display.
It’s built on top of Nokogiri and libxml2, so it’s fast. And it uses html5lib’s whitelist, so it most likely won’t make your codes less secure. *
* These statements have not been evaluated by Netexperts.
-
Strip unsafe tags, leaving behind only the inner text.
-
Prune unsafe tags and their subtrees, removing all traces that they ever existed.
-
Escape unsafe tags and their subtrees, leaving behind lots of
<
and>
entities. -
Whitewash the markup, removing all attributes and namespaced nodes.
-
Format the markup as plain text.
-
Replacements for Rails’s
strip_tags
andsanitize
helper methods. -
TWO! Count them, TWO! ActiveRecord extensions:
-
Loofah::XssFoliate (an XssTerminate drop-in replacement) is an opt-out sanitizer; by default all models and attributes are sanitized.
-
Loofah::ActiveRecordExtension is an opt-in sanitizer; you must explicitly declare attributes to be sanitized.
-
-
99 44/100 % pure
Loofah is the only ruby XSS/sanitization library that guarantees well-formed and valid markup.
Also, it’s pretty fast. Here is a benchmark comparing Loofah to other commonly-used libraries:
For a full explanation, see the documentation for Loofah.
require 'loofah' unsafe_html = "ohai! <div>a div is safe</div> <script>but script is not</script>" Loofah.scrub_fragment(unsafe_html, :prune).to_s # => "ohai! <div>div is safe</div> "
OR
doc = Loofah.fragment(unsafe_html) # returns a Nokogiri document ... doc.scrub!(:prune) # ... with one extra method doc.to_s # => "ohai! <div>div is safe</div> " doc.text # => "ohai! div is safe "
See Loofah::ActiveRecordExtension for more documentation. The methods mixed into ActiveRecord are:
-
Loofah::ActiveRecordExtension.html_document
-
Loofah::ActiveRecordExtension.html_fragment
Example:
# config/environment.rb Rails::Initializer.run do |config| config.gem 'loofah' end # db/schema.rb create_table "posts" do |t| t.string "title" t.text "body" t.string "author" end # app/model/post.rb class Post < ActiveRecord::Base html_fragment :body, :scrub => :prune # scrubs 'body' in a before_validation end
See Loofah::XssFoliate::ClassMethods for more documentation. The methods mixed into ActiveRecord are:
-
Loofah::XssFoliate::ClassMethods.xss_foliate
-
Loofah::XssFoliate::ClassMethods.xss_foliated?
If the constant LOOFAH_XSS_FOLIATE_ALL_MODELS is set, then all models inheriting from ActiveRecord::Base will sanitize their string and text attributes in a before_validate. Otherwise, the xss_foliate method is available for opting in to sanitization.
Example:
# config/environment LOOFAH_XSS_FOLIATE_ALL_MODELS = true Rails::Initializer.run do |config| config.gem "loofah" end # db/schema.rb create_table "posts" do |t| t.string "title" t.text "body" t.string "author" end # app/model/post.rb class Post < ActiveRecord::Base # by default, 'title', 'body' and 'author' will all be sanitized in a before_validation, # without additional declarations. end
OR
# app/model/post.rb class Post < ActiveRecord::Base # opt-out and/or modify the sanitization method used xss_foliate :except => [:title, :author], :prune => :body end
-
Nokogiri >= 1.3.3
-
ruby 1.8.6, 1.8.7 or 1.9
-
Rails 2.3, 2.2, 2.1, 2.0 or 1.2 (for ActiveRecord extensions)
Unsurprisingly:
-
gem install loofah
The bug tracker is available here:
For now, we’re piggybacking on the Nokogiri mailing list:
And the IRC channel is #nokogiri on freenode.
-
Nokogiri: nokogiri.org
-
libxml2: xmlsoft.org
-
html5lib: code.google.com/p/html5lib
-
XssTerminate: github.com/look/xss_terminate/tree/master
Featuring code contributed by:
-
Aaron Patterson
-
John Barnette
-
Josh Owens
-
Paul Dix
-
Josh Nichols
And a big shout-out to Corey Innis for the name, and feedback on the API.
This library was formerly known as Dryopteris, which was a very bad name that nobody could spell properly.
The MIT License
Copyright © 2009 Mike Dalessio, Bryan Helmkamp
See MIT-LICENSE.txt in this directory.