Tokenizer

Tokenizer is an HTTP proxy that injects third-party authentication credentials into requests. Clients encrypt third-party secrets using the proxy's public key. When the client wants to send a request to the third-party service, it does so via the proxy, sending along the encrypted secret in the Proxy-Tokenizer header. The proxy decrypts the secret and injects it into the client's request. To ensure that encrypted secrets can only be used by authorized clients, the encrypted data also includes instructions on authenticating the client.

Here's an example secret that the client encrypts using the proxy's public key:

secret = {
    inject_processor: {
        token: "my-stripe-api-token"
    },
    bearer_auth: {
        digest: Digest::SHA256.base64digest('trustno1')
    }
}

seal_key = ENV["TOKENIZER_PUBLIC_KEY"]
sealed_secret = RbNaCl::Boxes::Sealed.new(seal_key).box(secret.to_json)

The client configures their HTTP library to use the tokenizer service as its HTTP proxy:

conn = Faraday.new(
    proxy: "http://tokenizer.flycast",
    headers: {
        proxy_tokenizer: Base64.encode64(sealed_secret),
        proxy_authorization: "Bearer trustno1"
    }
)

conn.get("http://api.stripe.com")

The request will get rewritten to look like this:

GET / HTTP/1.1
Host: api.stripe.com
Authorization: Bearer my-stripe-api-token

Notice that the client's request is to http://api.stripe.com. In order for the proxy to inject credentials into requests, the client must speak plain HTTP to the proxy, not HTTPS. The proxy transparently switches to HTTPS for connections to upstream services. This assumes communication between the client and tokenizer happens over a secure transport (e.g. a VPN).
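The bearer_auth check above can be sketched in Ruby as follows. This is a simplified model, not tokenizer's actual Go implementation; the method name `authorized?` is made up for illustration, and `stored_digest` stands for the digest carried inside the decrypted secret:

```ruby
require "digest"

# Sketch of the proxy's bearer_auth check: the decrypted secret stores a
# SHA256 digest of the expected token; the client proves knowledge of the
# plaintext token via the Proxy-Authorization header.
def authorized?(stored_digest, proxy_authorization)
  return false unless proxy_authorization&.start_with?("Bearer ")

  presented = proxy_authorization.delete_prefix("Bearer ")
  Digest::SHA256.base64digest(presented) == stored_digest
end

stored = Digest::SHA256.base64digest("trustno1")
authorized?(stored, "Bearer trustno1")  # => true
authorized?(stored, "Bearer wrong")     # => false
```

Because only a digest of the token is stored in the sealed secret, decrypting the secret does not reveal the token the client must present.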

Processors

The processor dictates how the encrypted secret gets turned into a credential and added to the request. The example above uses inject_processor, which injects the secret verbatim into a request header. By default, the secret is injected into the Authorization header as a bearer token. The inject_processor can optionally specify a destination header (dst) and/or a printf-style format string (fmt) to apply when injecting the credential:

secret = {
    inject_processor: {
        token: "my-stripe-api-token",
        dst:   "X-Stripe-Token",
        fmt:   "token=%s",
    },
    bearer_auth: {
        digest: Digest::SHA256.base64digest('trustno1')
    }
}

This will result in the header getting injected like this:

X-Stripe-Token: token=my-stripe-api-token
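The dst/fmt behavior can be modeled in a few lines of Ruby. This is a hypothetical helper sketching the idea, not the proxy's actual code:

```ruby
# Sketch of inject_processor: place the secret token into a request header,
# optionally renaming the header (dst) and formatting the value (fmt).
def inject(headers, token, dst: "Authorization", fmt: "Bearer %s")
  headers.merge(dst => format(fmt, token))
end

headers = inject({}, "my-stripe-api-token",
                 dst: "X-Stripe-Token", fmt: "token=%s")
headers["X-Stripe-Token"]  # => "token=my-stripe-api-token"
inject({}, "my-stripe-api-token")  # => {"Authorization"=>"Bearer my-stripe-api-token"}
```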

Aside from inject_processor, there is also inject_hmac_processor. This creates an HMAC signature using the key stored in the encrypted secret and injects it into a request header. The hash algorithm can be specified in the secret under the key hash and defaults to SHA256. By default this processor signs the verbatim request body, but it can sign custom messages specified in the msg parameter of the Proxy-Tokenizer header (see request-time parameters below). This processor also respects the dst and fmt options.

secret = {
    inject_hmac_processor: {
        key: "my signing key",
        hash: "sha256"
    },
    bearer_auth: {
        digest: Digest::SHA256.base64digest('trustno1')
    }
}
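The signing step can be sketched with Ruby's standard OpenSSL library. The helper and the `X-Signature` header name are assumptions for illustration; the real processor runs inside the proxy:

```ruby
require "openssl"

# Sketch of inject_hmac_processor: sign the request body with the key from
# the decrypted secret and inject the hex signature into a header.
def sign(headers, body, key, hash: "sha256", dst: "X-Signature", fmt: "%s")
  mac = OpenSSL::HMAC.hexdigest(hash, key, body)
  headers.merge(dst => format(fmt, mac))
end

headers = sign({}, '{"amount":100}', "my signing key")
headers["X-Signature"]  # 64 hex characters for HMAC-SHA256
```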

Request-time parameters

If the destination/formatting might vary between requests, inject_processor and inject_hmac_processor can specify an allowlist of dst/fmt parameters that the client can specify at request time. These parameters are supplied as JSON in the Proxy-Tokenizer header after the encrypted secret.

secret = {
    inject_processor: {
        token: "my-stripe-api-token",
        allowed_dst: ["X-Stripe-Token", "Authorization"],
        allowed_fmt: ["Bearer %s", "token=%s"],
    },
    bearer_auth: {
        digest: Digest::SHA256.base64digest('trustno1')
    }
}

seal_key = ENV["TOKENIZER_PUBLIC_KEY"]
sealed_secret = RbNaCl::Boxes::Sealed.new(seal_key).box(secret.to_json)

processor_params = {
    dst: "X-Stripe-Token",
    fmt: "token=%s"
}

conn.headers[:proxy_tokenizer] = "#{Base64.encode64(sealed_secret)}; #{processor_params.to_json}"

conn.get("http://api.stripe.com")
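The header format above — sealed secret, then optional JSON parameters after a semicolon — and the allowlist check can be modeled like this. The parsing and method name are a simplified sketch of what the proxy might do, not its actual implementation:

```ruby
require "base64"
require "json"

# Sketch: split the Proxy-Tokenizer header into the sealed secret and the
# optional request-time parameters, then check dst/fmt against allowlists.
def parse_and_check(header, allowed_dst:, allowed_fmt:)
  sealed, params_json = header.split(";", 2).map(&:strip)
  params = params_json ? JSON.parse(params_json) : {}

  dst = params["dst"]
  fmt = params["fmt"]
  raise "dst not allowed" if dst && !allowed_dst.include?(dst)
  raise "fmt not allowed" if fmt && !allowed_fmt.include?(fmt)

  [Base64.decode64(sealed), dst, fmt]
end

header = "#{Base64.encode64("sealed")}; #{JSON.generate(dst: "X-Stripe-Token", fmt: "token=%s")}"
_, dst, fmt = parse_and_check(header,
  allowed_dst: ["X-Stripe-Token", "Authorization"],
  allowed_fmt: ["Bearer %s", "token=%s"])
```

Parameters not in the allowlist cause the request to be rejected, so a compromised client cannot redirect the credential into an attacker-chosen header or format.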

Host allowlist

If a client is fully compromised, an attacker could send encrypted secrets through tokenizer to a service that simply echoes back the request, and thereby learn the plaintext value of the secret. To mitigate this, secrets can specify which hosts they may be used against.

secret = {
    inject_processor: {
        token: "my-stripe-api-token"
    },
    bearer_auth: {
        digest: Digest::SHA256.base64digest('trustno1')
    },
    allowed_hosts: ["api.stripe.com"],
    # or
    # allowed_host_pattern: ".*\.stripe\.com$"
}
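The host check can be sketched as follows; the `host_allowed?` helper is an assumption for illustration, not the proxy's actual code:

```ruby
# Sketch of the host allowlist: a secret may pin an exact host list or a
# regular-expression pattern that the request's host must match.
def host_allowed?(host, allowed_hosts: nil, allowed_host_pattern: nil)
  return allowed_hosts.include?(host) if allowed_hosts
  return !!(host =~ Regexp.new(allowed_host_pattern)) if allowed_host_pattern

  true # no restriction configured in the secret
end

host_allowed?("api.stripe.com", allowed_hosts: ["api.stripe.com"])  # => true
host_allowed?("evil.example",   allowed_hosts: ["api.stripe.com"])  # => false
host_allowed?("api.stripe.com", allowed_host_pattern: '.*\.stripe\.com$')  # => true
```

Because the restriction is sealed inside the encrypted secret, a compromised client cannot strip it off.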

Production deployment — fly.io

Assuming you have flyctl installed, start by cloning this repository:

git clone https://github.com/superfly/tokenizer
cd ./tokenizer

Create a fly.io app:

fly apps create
export FLY_APP="<name of app>"

Generate a private (open) key:

OPEN_KEY=$(openssl rand -hex 32)
fly secrets set --stage OPEN_KEY=$OPEN_KEY

Deploy the app without making it available on the internet¹:

fly deploy --no-public-ips

Tokenizer is now deployed and accessible to other apps in your org at <name of app>.flycast. The deploy logs will contain the public (seal) key, which can be used for encrypting secrets.

¹ Assigning a public IP address to the app is not recommended, since tokenizer will happily proxy traffic to private IP addresses. If you require a public deployment, consider running tokenizer in a separate, dedicated organization or using it in conjunction with smokescreen.

Production deployment — custom

Tokenizer is totally stateless, so it's simple to deploy anywhere.

Assuming you have Go installed, you can build and install tokenizer into /usr/local/bin by running:

GOBIN=/usr/local/bin go install github.com/superfly/tokenizer/cmd/tokenizer@latest

Generate a private (open) key:

export OPEN_KEY=$(openssl rand -hex 32)

Run the tokenizer server:

tokenizer

The output will contain the public (seal) key, which can be used for encrypting secrets.

Test deployment

See the READMEs in github.com/superfly/tokenizer/cmd/tokenizer and github.com/superfly/tokenizer/cmd/curl for instructions on running/testing tokenizer locally.

Configuration

Tokenizer is configured with the following environment variables:

  • OPEN_KEY - The hex-encoded 32-byte private key used for decrypting secrets.
  • LISTEN_ADDRESS - The address (ip:port) to listen on.
  • FILTERED_HEADERS - A comma-separated list of request headers to strip from client requests.
  • OPEN_PROXY - Set to 1 or true to allow proxying of requests that don't contain sealed secrets. Such requests are blocked by default.


Issues

Idea: Encrypt incoming data by default

Feel free to close this, I just thought this could be a fun project for anyone interested in playing around with Tokenizer.

Goal: Ensure your server cannot hold any sensitive data by automatically encrypting incoming fields in requests (e.g. credit card data).

How:

  • Inbound: Encrypt with Tokenizer's public key using Cloudflare Rules and/or Workers (or any other provider that you can place in front of all your traffic)
  • Your backend just gets the tokenized value substituted inside the request body
  • Outbound: Send data through Tokenizer; values encrypted by default will be replaced in the request to your destination (e.g. Stripe)

Most of the work here seems unrelated to this project: configuring encryption and substitutions in the provider of your choice. I think this project is interesting because it makes Tokenizer a replacement for a service like VGS (https://www.verygoodsecurity.com/platform).

Any ideas on how to improve this, or what to watch out for if someone wants to implement it?

LICENSE

It seems from the blog post like your intent is that folks be able to use Tokenizer themselves outside of Fly.

Would you consider adding a license to the repo to make this permission explicit?

(same goes for ssokenizer)
