Git Product home page Git Product logo

loofah's Introduction

Loofah

DESCRIPTION

Loofah is an HTML sanitizer. It will always fix broken markup, but can also sanitize unsafe tags in a few different ways, and transform the markup for storage or display.

It’s built on top of Nokogiri and libxml2, so it’s fast. And it uses html5lib’s whitelist, so it most likely won’t make your codes less secure. *

* These statements have not been evaluated by Netexperts.

FEATURES

  • Strip unsafe tags, leaving behind only the inner text.

  • Prune unsafe tags and their subtrees, removing all traces that they ever existed.

  • Escape unsafe tags and their subtrees, leaving behind lots of < and > entities.

  • Whitewash the markup, removing all attributes and namespaced nodes.

  • Format the markup as plain text.

  • Replacements for Rails’s strip_tags and sanitize helper methods.

  • TWO! Count them, TWO! ActiveRecord extensions:

    • Loofah::XssFoliate (an XssTerminate drop-in replacement) is an opt-out sanitizer; by default all models and attributes are sanitized.

    • Loofah::ActiveRecordExtension is an opt-in sanitizer; you must explicitly declare attributes to be sanitized.

  • 99 44/100 % pure

COMPARE AND CONTRAST

Loofah is the only ruby XSS/sanitization library that guarantees well-formed and valid markup.

Also, it’s pretty fast. Here is a benchmark comparing Loofah to other commonly-used libraries:

SYNOPSIS

For a full explanation, see the documentation for Loofah.

require 'loofah'

unsafe_html = "ohai! <div>a div is safe</div> <script>but script is not</script>"

Loofah.scrub_fragment(unsafe_html, :prune).to_s  # => "ohai! <div>div is safe</div> "

OR

doc = Loofah.fragment(unsafe_html)  # returns a Nokogiri document ...
doc.scrub!(:prune)                  # ... with one extra method
doc.to_s                            # => "ohai! <div>div is safe</div> "
doc.text                            # => "ohai! div is safe "

ACTIVERECORD EXTENSION #1: OPT-IN

See Loofah::ActiveRecordExtension for more documentation. The methods mixed into ActiveRecord are:

  • Loofah::ActiveRecordExtension.html_document

  • Loofah::ActiveRecordExtension.html_fragment

Example:

# config/environment.rb
Rails::Initializer.run do |config|
  config.gem 'loofah'
end

# db/schema.rb
create_table "posts" do |t|
  t.string  "title"
  t.text    "body"
  t.string  "author"
end

# app/model/post.rb
class Post < ActiveRecord::Base
  html_fragment :body, :scrub => :prune  # scrubs 'body' in a before_validation
end

ACTIVERECORD EXTENSION #2: OPT-OUT (XSS_TERMINATE DROP-IN REPLACEMENT)

See Loofah::XssFoliate::ClassMethods for more documentation. The methods mixed into ActiveRecord are:

  • Loofah::XssFoliate::ClassMethods.xss_foliate

  • Loofah::XssFoliate::ClassMethods.xss_foliated?

If the constant LOOFAH_XSS_FOLIATE_ALL_MODELS is set, then all models inheriting from ActiveRecord::Base will sanitize their string and text attributes in a before_validate. Otherwise, the xss_foliate method is available for opting in to sanitization.

Example:

# config/environment
LOOFAH_XSS_FOLIATE_ALL_MODELS = true
Rails::Initializer.run do |config|
  config.gem "loofah"
end

# db/schema.rb
create_table "posts" do |t|
  t.string  "title"
  t.text    "body"
  t.string  "author"
end

# app/model/post.rb
class Post < ActiveRecord::Base
  # by default, 'title', 'body' and 'author' will all be sanitized in a before_validation,
  # without additional declarations.
end

OR

# app/model/post.rb
class Post < ActiveRecord::Base
  # opt-out and/or modify the sanitization method used
  xss_foliate :except => [:title, :author], :prune => :body
end

REQUIREMENTS

  • Nokogiri >= 1.3.3

  • ruby 1.8.6, 1.8.7 or 1.9

  • Rails 2.3, 2.2, 2.1, 2.0 or 1.2 (for ActiveRecord extensions)

INSTALLATION

Unsurprisingly:

  • gem install loofah

SUPPORT

The bug tracker is available here:

For now, we’re piggybacking on the Nokogiri mailing list:

And the IRC channel is #nokogiri on freenode.

AUTHORS

Featuring code contributed by:

  • Aaron Patterson

  • John Barnette

  • Josh Owens

  • Paul Dix

  • Josh Nichols

And a big shout-out to Corey Innis for the name, and feedback on the API.

HISTORICAL NOTE

This library was formerly known as Dryopteris, which was a very bad name that nobody could spell properly.

LICENSE

The MIT License

Copyright © 2009 Mike Dalessio, Bryan Helmkamp

See MIT-LICENSE.txt in this directory.

loofah's People

Contributors

brynary avatar flavorjones avatar jbarnette avatar pauldix avatar queso avatar technicalpickles avatar tenderlove avatar

Stargazers

 avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.