Blind Index

Securely search encrypted database fields

Designed for use with attr_encrypted

Here’s a full example of how to use it

Check out this post for more info on securing sensitive data with Rails

How It Works

We use this approach by Scott Arciszewski. To summarize, we compute a keyed hash of the sensitive data and store it in a column. To query, we apply the keyed hash function to the value we’re searching and then perform a database search. This results in performant queries for exact matches. LIKE queries are not possible, but you can index expressions.
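The idea can be sketched in plain Ruby using only the standard library. This is a minimal illustration, not the gem's implementation: the helper name `blind_index_for`, the use of the key as the PBKDF2 salt, and the in-memory `rows` array standing in for a database table are all assumptions made for the example.

```ruby
require "openssl"
require "securerandom"
require "base64"

# Hypothetical helper: keyed hash of a value, Base64-encoded for storage.
# Assumption: the secret key is used as the PBKDF2 salt.
def blind_index_for(value, key, iterations: 10_000, size: 32)
  raw = OpenSSL::PKCS5.pbkdf2_hmac(value, key, iterations, size, "sha256")
  Base64.strict_encode64(raw)
end

key = SecureRandom.random_bytes(32)

# Stand-in for a table: only the blind index is stored here
rows = ["alice@example.org", "bob@example.org"].map do |email|
  { email_bidx: blind_index_for(email, key) }
end

# To query, hash the search term with the same key and compare for equality
needle = blind_index_for("bob@example.org", key)
matches = rows.select { |row| row[:email_bidx] == needle }
puts matches.size
```

Because the comparison is a plain equality check on a stored string, a regular database index on the column is enough to make it fast.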

Leakage

An important consideration in searchable encryption is leakage, which is information an attacker can gain. Blind indexing leaks that rows have the same value. If you use this for a field like last name, an attacker can use frequency analysis to predict the values. In an active attack where an attacker can control the input values, they can learn which other values in the database match.

Here’s a great article on leakage in searchable encryption. Blind indexing has the same leakage as deterministic encryption.
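The frequency leak can be seen in a small sketch. It uses HMAC-SHA256 from Ruby's standard library as a stand-in keyed hash (an assumption for brevity, not the gem's default algorithm), and the sample last names are made up.

```ruby
require "openssl"
require "securerandom"

key = SecureRandom.random_bytes(32)
last_names = %w[Smith Smith Smith Jones Jones Nguyen]

# Stand-in keyed hash: equal inputs always produce equal outputs
indexes = last_names.map { |name| OpenSSL::HMAC.hexdigest("SHA256", key, name) }

# An attacker who sees only the index column can still count repeats
counts = indexes.group_by(&:itself).map { |_, group| group.size }.sort.reverse
p counts
```

The counts mirror the distribution of the plaintext names, so knowing which last name is most common is enough to guess which index value belongs to it.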

Installation

Add this line to your application’s Gemfile:

gem 'blind_index'

Getting Started

Note: Your model should already be set up with attr_encrypted. The examples are for a User model with attr_encrypted :email. See the full example if needed.

Create a migration to add a column for the blind index

add_column :users, :encrypted_email_bidx, :string
add_index :users, :encrypted_email_bidx

Next, generate a key

SecureRandom.hex(32)

Store the key with your other secrets. This is typically Rails credentials or an environment variable (dotenv is great for this). Be sure to use different keys in development and production, and make sure the blind index key is different from the key you use for encryption. Keys don’t need to be hex-encoded, but it’s often easier to store them this way.

Here’s a key you can use in development

EMAIL_BLIND_INDEX_KEY=ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

Add to your model

class User < ApplicationRecord
  blind_index :email, key: [ENV["EMAIL_BLIND_INDEX_KEY"]].pack("H*")
end

pack is used to decode the hex value
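As a quick check of what that decoding does, the snippet below round-trips a freshly generated key. It only uses Ruby's standard library, and the variable names are illustrative.

```ruby
require "securerandom"

hex_key = SecureRandom.hex(32)   # 64 hex characters
key = [hex_key].pack("H*")       # decode hex into raw bytes

puts hex_key.length                      # 64
puts key.bytesize                        # 32
puts key.encoding == Encoding::BINARY    # true

# unpack reverses the decoding
puts key.unpack("H*").first == hex_key   # true
```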

Backfill existing records

User.find_each do |user|
  user.compute_email_bidx
  user.save!
end

And query away

User.where(email: "test@example.org")

Validations

To prevent duplicates, use:

class User < ApplicationRecord
  validates :email, uniqueness: true
end

We also recommend adding a unique index to the blind index column through a database migration.

Expressions

You can apply expressions to attributes before indexing and searching. This gives you the ability to perform case-insensitive searches and more.

class User < ApplicationRecord
  blind_index :email, expression: ->(v) { v.downcase } ...
end
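To see why an expression enables case-insensitive search, here is a minimal sketch in plain Ruby. It is not the gem's code: `keyed_hash` is a hypothetical helper built on HMAC-SHA256, used as a stand-in for the configured algorithm.

```ruby
require "openssl"
require "securerandom"

# Hypothetical helper: apply the optional expression, then the keyed hash
def keyed_hash(value, key, expression: nil)
  value = expression.call(value) if expression
  OpenSSL::HMAC.hexdigest("SHA256", key, value)
end

key = SecureRandom.random_bytes(32)
downcase = ->(v) { v.downcase }

stored = keyed_hash("Test@Example.org", key, expression: downcase)

# Without the expression, differently cased input does not match
without_expression = keyed_hash("test@example.org", key) == keyed_hash("Test@Example.org", key)
# With it, both sides are normalized before hashing, so they match
with_expression = keyed_hash("TEST@example.ORG", key, expression: downcase) == stored

puts without_expression
puts with_expression
```

The same expression has to run on both the stored value and the search term, which is why it is declared once on the blind index rather than at each query.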

Multiple Indexes

You may want multiple blind indexes for an attribute. To do this, add another column:

add_column :users, :encrypted_email_ci_bidx, :string
add_index :users, :encrypted_email_ci_bidx

Update your model

class User < ApplicationRecord
  blind_index :email, ...
  blind_index :email_ci, attribute: :email, expression: ->(v) { v.downcase } ...
end

Backfill existing records

User.find_each do |user|
  user.compute_email_ci_bidx
  user.save!
end

And query away

User.where(email_ci: "test@example.org")

Index Only

If you don’t need to store the original value (for instance, when just checking duplicates), use a virtual attribute:

class User < ApplicationRecord
  attribute :email
  blind_index :email, ...
end

Requires ActiveRecord 5.1+

Multiple Columns

You can also use virtual attributes to index data from multiple columns:

class User < ApplicationRecord
  attribute :initials

  # must come before the blind_index method so it runs first
  before_validation :set_initials, if: -> { changes.key?(:first_name) || changes.key?(:last_name) }

  blind_index :initials, ...

  def set_initials
    self.initials = "#{first_name[0]}#{last_name[0]}"
  end
end

Requires ActiveRecord 5.1+

Algorithms

PBKDF2-SHA256

The default hashing algorithm. Key stretching increases the amount of time required to compute hashes, which slows down brute-force attacks.

The default number of iterations is 10,000. For highly sensitive fields, set this to at least 100,000.

class User < ApplicationRecord
  blind_index :email, iterations: 100000, ...
end

Changing this requires you to recompute the blind index.
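The trade-off can be measured with a rough sketch using Ruby's OpenSSL and Benchmark libraries. The all-zero key and the use of the key as the salt are assumptions for illustration only, and timings vary by machine.

```ruby
require "openssl"
require "benchmark"

key = "\x00".b * 32 # placeholder key for the demo only
value = "test@example.org"

results = [10_000, 100_000].map do |iterations|
  digest = nil
  seconds = Benchmark.realtime do
    digest = OpenSSL::PKCS5.pbkdf2_hmac(value, key, iterations, 32, "sha256")
  end
  puts format("%7d iterations: %.4fs", iterations, seconds)
  digest
end

# Different iteration counts give different hashes, which is why
# existing blind indexes must be recomputed after a change
puts results[0] == results[1]
```

The cost applies to every write and every query on the indexed attribute, so it is worth timing on production-like hardware before raising the count.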

Argon2

Argon2 is the state-of-the-art algorithm and recommended for best security.

To use it, add argon2 to your Gemfile and set:

class User < ApplicationRecord
  blind_index :email, algorithm: :argon2, ...
end

The default cost parameters are {t: 3, m: 12}. For highly sensitive fields, set this to at least {t: 4, m: 15}.

class User < ApplicationRecord
  blind_index :email, algorithm: :argon2, cost: {t: 4, m: 15}, ...
end

Changing this requires you to recompute the blind index.

The variant used is Argon2i.

Other

scrypt is also supported. Unless you have specific reasons to use it, go with Argon2 instead.

Key Rotation

To rotate keys without downtime, add a new column:

add_column :users, :encrypted_email_v2_bidx, :string
add_index :users, :encrypted_email_v2_bidx

And add to your model

class User < ApplicationRecord
  blind_index :email, key: [ENV["EMAIL_BLIND_INDEX_KEY"]].pack("H*")
  blind_index :email_v2, attribute: :email, key: [ENV["EMAIL_V2_BLIND_INDEX_KEY"]].pack("H*")
end

Backfill the data

User.find_each do |user|
  user.compute_email_v2_bidx
  user.save!
end

Then update your model

class User < ApplicationRecord
  blind_index :email, bidx_attribute: :encrypted_email_v2_bidx, key: [ENV["EMAIL_V2_BLIND_INDEX_KEY"]].pack("H*")

  # remove this line after dropping column
  self.ignored_columns = ["encrypted_email_bidx"]
end

Finally, drop the old column.

Fixtures

You can use encrypted attributes and blind indexes in fixtures with:

test_user:
  encrypted_email: <%= User.encrypt_email("test@example.org", iv: Base64.decode64("0000000000000000")) %>
  encrypted_email_iv: "0000000000000000"
  encrypted_email_bidx: <%= User.compute_email_bidx("test@example.org").inspect %>

Be sure to include the inspect at the end, or it won’t be encoded properly in YAML.
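A likely reason, assuming the blind index is Base64-encoded with Ruby's `pack("m")`, is that the encoded string ends in a newline, which a plain YAML scalar does not preserve. The sketch below shows the difference using a placeholder value rather than a real blind index.

```ruby
require "yaml"

# Placeholder standing in for a 32-byte blind index (assumption: pack("m") encoding)
bidx = [("\x00".b * 32)].pack("m")
p bidx.end_with?("\n")

plain  = YAML.safe_load("encrypted_email_bidx: #{bidx}")
quoted = YAML.safe_load("encrypted_email_bidx: #{bidx.inspect}")

# The plain scalar loses the trailing newline; the inspected one keeps it
p plain["encrypted_email_bidx"] == bidx
p quoted["encrypted_email_bidx"] == bidx
```

If the fixture value differs from the computed index by even one character, queries against the fixture row will not match.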

Reference

By default, blind indexes are encoded in Base64. Set a different encoding with:

class User < ApplicationRecord
  blind_index :email, encode: ->(v) { v.unpack("H*").first } # hex-encode the raw bytes
end

By default, blind indexes are 32 bytes. Set a smaller size with:

class User < ApplicationRecord
  blind_index :email, size: 16
end
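To get a feel for how encoding and size affect what ends up in the column, the sketch below encodes random bytes (standing in for a blind index) in Base64 and hex. Taking the first 16 bytes to model a smaller size is an assumption for illustration; the gem may derive shorter indexes differently.

```ruby
require "securerandom"
require "base64"

raw = SecureRandom.random_bytes(32) # stand-in for a 32-byte blind index

base64 = Base64.strict_encode64(raw)
hex = raw.unpack("H*").first

puts base64.length # 44
puts hex.length    # 64

# Stand-in for a 16-byte index
short = raw.byteslice(0, 16)
puts Base64.strict_encode64(short).length # 24
```

A smaller size saves storage but increases the chance that two different values produce the same index.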

Alternatives

One alternative to blind indexing is to use a deterministic encryption scheme, like AES-SIV. In this approach, the encrypted data will be the same for matches.

Upgrading

0.3.0

This version introduces a breaking change to enforce secure key generation. An error is thrown if your blind index key isn’t both binary and 32 bytes.
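The rule described above can be expressed as a small check. This is a sketch of the stated criteria, not the gem's actual code, and `valid_blind_index_key?` is a hypothetical helper.

```ruby
require "securerandom"

# Hypothetical helper: true only for binary-encoded, 32-byte keys
def valid_blind_index_key?(key)
  key.is_a?(String) && key.encoding == Encoding::BINARY && key.bytesize == 32
end

hex_key = SecureRandom.hex(32)

p valid_blind_index_key?(hex_key)               # false: 64 bytes of hex text
p valid_blind_index_key?([hex_key].pack("H*"))  # true: decoded to 32 raw bytes
p valid_blind_index_key?("secret")              # false: too short, not binary
```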

We recommend rotating your key if it doesn’t meet these criteria. You can generate a new key in the Rails console with:

SecureRandom.hex(32)

Update your model to convert the hex key to binary.

class User < ApplicationRecord
  blind_index :email, key: [ENV["EMAIL_BLIND_INDEX_KEY"]].pack("H*")
end

And recompute the blind index.

User.find_each do |user|
  user.compute_email_bidx
  user.save!
end

To continue without rotating, set:

class User < ApplicationRecord
  blind_index :email, insecure_key: true, ...
end

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development and testing:

git clone https://github.com/ankane/blind_index.git
cd blind_index
bundle install
rake test
