authzed / spicedb

Open Source, Google Zanzibar-inspired permissions database to enable fine-grained access control for customer applications

Home Page: https://authzed.com/docs

License: Apache License 2.0

Languages: Go 99.38%, JavaScript 0.47%, HTML 0.10%, Dockerfile 0.05%
Topics: zanzibar, permissions, database, scale, latency, production, distributed-systems, grpc, security, security-tools

spicedb's Introduction

SpiceDB sets the standard for authorization that scales.

Scale with
Traffic • Dev Velocity • Functionality • Geography


What is SpiceDB?

SpiceDB is a graph database purpose-built for storing and evaluating access control data.

In 2021, broken access control became the #1 threat to the web. With SpiceDB, developers finally have a way to stop this threat the same way the hyperscalers do.

Why SpiceDB?

  • World-class engineering: painstakingly built by experts that pioneered the cloud-native ecosystem
  • Authentic design: mature and feature-complete implementation of Google's Zanzibar paper
  • Proven in production: 5ms p95 when scaled to millions of queries/s, billions of relationships
  • Global consistency: consistency configured per-request unlocks correctness while maintaining performance
  • Multi-paradigm: caveated relationships combine the best concepts in authorization: ABAC & ReBAC
  • Safety in tooling: design schemas with real-time validation, or validate them in your CI/CD workflow
  • Reverse indexes: answer queries such as "What can this subject access?" and "Who can access this resource?"

Joining the Community

SpiceDB is a community project where everyone is invited to participate and feel welcomed. While the project has a technical goal, participation is not restricted to those with code contributions.

Contribute

CONTRIBUTING.md documents communication, contribution flow, legal requirements, and common tasks when contributing to the project.

You can find issues by priority: Urgent, High, Medium, Low, Maybe. There are also good first issues.

Our documentation website is also open source if you'd like to clarify anything you find confusing.

Getting Started

Installing the binary

Binary releases are available for Linux, macOS, and Windows on AMD64 and ARM64 architectures.

Homebrew users for both macOS and Linux can install the latest binary releases of SpiceDB and zed using the official tap:

brew install authzed/tap/spicedb authzed/tap/zed

Debian-based Linux users can install SpiceDB packages by adding a new APT source:

sudo apt update && sudo apt install -y curl ca-certificates gpg
curl https://pkg.authzed.com/apt/gpg.key | sudo apt-key add -
echo "deb https://pkg.authzed.com/apt/ * *" | sudo tee /etc/apt/sources.list.d/fury.list
sudo apt update && sudo apt install -y spicedb zed

RPM-based Linux users can install SpiceDB packages by adding a new YUM repository:

sudo tee /etc/yum.repos.d/Authzed-Fury.repo << EOF
[authzed-fury]
name=AuthZed Fury Repository
baseurl=https://pkg.authzed.com/yum/
enabled=1
gpgcheck=0
EOF
sudo dnf install -y spicedb zed

Running a container

Container images for AMD64 and ARM64 architectures are published to multiple registries.

Docker users can run the latest SpiceDB container with the following:

docker run --rm -p 50051:50051 authzed/spicedb serve --grpc-preshared-key "somerandomkeyhere"

SpiceDB containers use Chainguard Images to ship a bare-minimum userspace, which is a huge boon to security but can complicate debugging. If you want to open a shell session in a running SpiceDB container and install packages, you can use one of our debug images.

Appending -debug to any tag will give you an image whose userspace includes debug tooling:

docker run --rm -ti --entrypoint sh authzed/spicedb:latest-debug

Containers are also available for each git commit to the main branch under ${REGISTRY}/authzed/spicedb-git:${COMMIT}.

Deploying to Kubernetes

Production Kubernetes users should rely on a stable release of the SpiceDB Operator. The Operator not only enforces best practices, but also orchestrates SpiceDB updates without downtime.

If you're only experimenting, feel free to try out one of our community-maintained examples for testing SpiceDB on Kubernetes:

kubectl apply -f https://raw.githubusercontent.com/authzed/examples/main/kubernetes/example.yaml

Developing your own schema

You can try both SpiceDB and zed entirely in your browser in the hosted Playground thanks to the power of WebAssembly. The Playground app is open source and can also be self-hosted.

If you don't want to start with the examples loadable from the Playground, you can follow a guide for developing a schema or review the schema language design documentation.

Watch the SpiceDB primer video to get started with schema development:


Trying out the API

For debugging or getting started, we recommend installing zed, the official command-line client. The Playground also has a tab for experimenting with zed all from within your browser.

When it's time to write code, we recommend using one of the existing client libraries whether it's official or community-maintained.

Because every millisecond counts, we recommend using libraries that leverage the gRPC API for production workloads.

To get an understanding of integrating an application with SpiceDB, you can follow the Protecting Your First App guide or review API documentation on the Buf Registry or Postman.

Acknowledgements

SpiceDB is a community project fueled by contributions from both organizations and individuals. We appreciate all contributions, large and small, and would like to thank all those involved.

In addition, we'd like to highlight a few notable contributions:

  • The GitHub Authorization Team, for implementing and contributing the MySQL datastore
  • The Netflix Authorization Team, for sponsoring and being a design partner for caveats
  • The Equinix Metal Team, for sponsoring our benchmarking hardware


spicedb's Issues

Garbage collection for Postgres data driver

The Postgres datastore driver currently keeps the full history of all relationships ever written or deleted. It should allow a garbage-collection policy to be specified, indicating the point after which deleted relationship rows are removed.
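For illustration, such a policy could be enforced by a periodic job that deletes rows whose deletion revision has fallen behind a configurable watermark. A minimal sketch in Go; the relation_tuple table and deleted_transaction column names are hypothetical stand-ins, not SpiceDB's actual Postgres schema:

package gc

import (
	"context"
	"database/sql"
)

// collectGarbage deletes relationship rows whose deletion transaction is at
// or below the GC watermark. Table and column names are illustrative
// stand-ins, not SpiceDB's actual Postgres schema.
func collectGarbage(ctx context.Context, db *sql.DB, watermark uint64) (int64, error) {
	res, err := db.ExecContext(ctx,
		`DELETE FROM relation_tuple WHERE deleted_transaction <= $1`, watermark)
	if err != nil {
		return 0, err
	}
	return res.RowsAffected()
}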

Setup CI pipeline for benchmarking performance

This requires a few steps:

  • Converting our existing ad-hoc process for performance testing into a GitHub Action
  • Connecting GitHub Actions to dedicated hardware that can measure performance without noisy-neighbor issues
  • Formalizing the output of performance tests
  • Determining how and when these tests run
  • Determining how to produce action items from tests that are run

Migrate zed-testserver into a subcommand

Now that all of SpiceDB is open source, there's no reason for the test server to be built as a separate binary.
Instead, it could easily be migrated into a subcommand of the main binary.

`time` field in pgx logs is repeated

Right now, the time field in the pgx log output is repeated:

{"level":"info","module":"pgx","args":[],"commandTag":"...","sql":"begin read only","time":2.258174,"time":"2021-09-02T16:31:54Z","message":"Exec"}

Note that when the produced JSON is piped through tooling, one of the two values is lost.
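One possible fix is to rename the colliding key in the adapter that forwards pgx's log data to zerolog. A hedged sketch, not SpiceDB's actual logging adapter:

package logging

import "github.com/rs/zerolog"

// logPgx forwards a pgx log record to zerolog, renaming pgx's "time" field
// (the query duration) so that it no longer collides with zerolog's own
// timestamp field. An illustrative sketch, not SpiceDB's actual adapter.
func logPgx(logger zerolog.Logger, msg string, data map[string]interface{}) {
	event := logger.Info()
	for key, value := range data {
		if key == "time" {
			key = "duration" // pgx reports query duration under "time"
		}
		event = event.Interface(key, value)
	}
	event.Msg(msg)
}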

Consistent CLI flags with zed

There are a variety of command-line flags that are shared across Authzed projects that do the same thing, but are named differently.

This issue should track documenting those shared flags and updating them so that everything is consistent.
Where it makes sense, flags can be upstreamed into cobrautil, like the zerolog flags already are.

Determine behavior when Write is provided duplicate tuples

There is no validation that deduplicates or asserts that every Tuple provided to authzed.api.v0.ACLService.Write() must be unique. Because of this, it is up to the data store to determine whether or not this will manifest itself as an error.

There are two immediately obvious solutions:

  • Validate that every Tuple is unique and return an error if not
  • Deduplicate Tuples before attempting to persist them into the data store (sketched below)
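A minimal sketch of the second option, using a comparable stand-in type in place of the real Tuple message:

package write

// Tuple is a comparable stand-in for the real API's tuple message; an actual
// implementation would key on the tuple's canonical string form instead.
type Tuple struct {
	Object, Relation, Subject string
}

// dedupeTuples drops exact duplicates so the datastore only ever sees each
// tuple once per Write call.
func dedupeTuples(in []Tuple) []Tuple {
	seen := make(map[Tuple]struct{}, len(in))
	out := make([]Tuple, 0, len(in))
	for _, t := range in {
		if _, ok := seen[t]; ok {
			continue
		}
		seen[t] = struct{}{}
		out = append(out, t)
	}
	return out
}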

Enable arbitrary nesting of arrow operators

The arrow operator can only be used once in an expression (e.g. org->admin), but should be able to support traversing an arbitrary number of relations (e.g. namespace->org->admin, tenant->namespace->org->admin->member).

This would allow for arbitrarily nested arrows in schemas.

Add demo command

A CLI command that runs a demo of SpiceDB, with a schema and data already loaded and load being generated against it, would be useful both for load testing and for showing users what using the database is like.

This is inspired by the same functionality in CockroachDB.

Leopard indexing system

The Zanzibar paper devotes a lot of time to the Leopard cache, which caches member-to-group and group-to-group relationships:

Recursive pointer chasing during check evaluation has difficulty maintaining low latency with groups that are deeply nested or have a large number of child groups. For selected namespaces that exhibit such structure, Zanzibar handles checks using Leopard, a specialized index that supports efficient set computation.

The Leopard system consists of three discrete parts: a serving system capable of consistent and low-latency operations across sets; an offline, periodic index building system; and an online real-time layer capable of continuously updating the serving system as tuple changes occur.

Implement a dedicated gRPC API for dispatch

Distributed dispatching currently reuses the existing gRPC API and uses headers to propagate context.
It might make sense to design an API dedicated for dispatching that encapsulates this context and is agnostic to the API with which the end-user made a request.

Pagination support for Expand API

Add support for pagination to the Expand API.

Should ideally make use of a generic pagination system over the dispatcher that can be shared with Lookup

checkDirect optimizations

In https://github.com/authzed/spicedb/blob/main/internal/graph/check.go#L61, we look up all relationships under a relation and then loop over the results searching for the goal subject, kicking off recursive checks whenever a non-terminal subject is found.

We should consider making the following optimizations:

  • Check for the goal subject directly in a single query, without a loop, and only if that subject relation is possible on the relation via type information, if present (see the sketch after this list)
  • Look up only the non-terminal ONRs in their own query and use those results to issue recursive checks
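A hedged sketch of the first optimization, using stand-in types rather than SpiceDB's actual datastore interfaces:

package graph

import "context"

// TupleFilter and TupleQuerier are illustrative stand-ins, not SpiceDB's
// actual datastore interfaces.
type TupleFilter struct {
	ObjectID, Relation, SubjectID string
}

type TupleQuerier interface {
	TupleExists(ctx context.Context, filter TupleFilter) (bool, error)
}

// checkDirectSubject probes for the goal subject with one targeted query
// instead of scanning every relationship under the relation in a loop.
func checkDirectSubject(ctx context.Context, ds TupleQuerier, objectID, relation, subjectID string) (bool, error) {
	return ds.TupleExists(ctx, TupleFilter{
		ObjectID:  objectID,
		Relation:  relation,
		SubjectID: subjectID,
	})
}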

API v1 Requirements

This issue is meant to track design requirements for the v1 API.

  • Support for schema language instead of namespace configs
  • All API calls are consistent by default
  • Zookies optionally can be provided to safely allow flexibility in consistency
  • Subject References should treat a lack of relation as "ellipsis"

Handle CRDB Retry responses

See: cockroachlabs.com/docs/v21.1/transactions#client-side-intervention

Here's an example of spicedb logs when a retry isn't handled:

5:04PM ERR Exec args=[] err="ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh): \"sql txn\" meta={id=caf1be11 key=/Table/54/2/\"QRdiUVdMQmkyjQjX\"/\"user\"/\"...\"/\"resource\"/\"direct\"/\"thegoods\"/0 pri=0.04592771 epo=0 ts=1631394282.849166225,2 min=1631394148.811930011,0 seq=6} lock=true stat=PENDING rts=1631394148.811930011,0 wto=false gul=1631394149.311930011,0 (SQLSTATE 40001)" module=pgx sql=commit
5:04PM ERR finished server unary call error="unable to write tuples: ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh): \"sql txn\" meta={id=caf1be11 key=/Table/54/2/\"QRdiUVdMQmkyjQjX\"/\"user\"/\"...\"/\"resource\"/\"direct\"/\"thegoods\"/0 pri=0.04592771 epo=0 ts=1631394282.849166225,2 min=1631394148.811930011,0 seq=6} lock=true stat=PENDING rts=1631394148.811930011,0 wto=false gul=1631394149.311930011,0 (SQLSTATE 40001)" grpc.code=Unknown grpc.method=Write grpc.method_type=unary grpc.service=authzed.api.v0.ACLService grpc.start_time=2021-09-11T17:02:29-04:00 grpc.time_ms=3712.738 kind=server system=grpc
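CockroachDB's documentation recommends handling this with a client-side retry loop keyed on SQLSTATE 40001. A minimal sketch using pgx; the wrapper names are hypothetical, and production code would cap attempts and back off between retries:

package crdb

import (
	"context"
	"errors"

	"github.com/jackc/pgconn"
	"github.com/jackc/pgx/v4"
)

// executeWithRetries re-runs fn whenever CockroachDB asks the client to
// retry (SQLSTATE 40001).
func executeWithRetries(ctx context.Context, conn *pgx.Conn, fn func(pgx.Tx) error) error {
	for {
		err := executeTx(ctx, conn, fn)
		var pgErr *pgconn.PgError
		if errors.As(err, &pgErr) && pgErr.Code == "40001" {
			continue // serialization conflict: safe to retry
		}
		return err
	}
}

func executeTx(ctx context.Context, conn *pgx.Conn, fn func(pgx.Tx) error) error {
	tx, err := conn.Begin(ctx)
	if err != nil {
		return err
	}
	defer tx.Rollback(ctx) // returns a no-op error if already committed
	if err := fn(tx); err != nil {
		return err
	}
	return tx.Commit(ctx)
}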

Datastore interface should have more fine grained control over transactions

Without exposing a better API that enables multiple calls to span the same transaction, API handlers cannot make perfect guarantees about the consistency of the data they are saving.

A strawman API would look something like this, using a closure to get any data in and out:

// within an API handler
err := datastore.Transact(ctx, func(ctx context.Context, tx datastore.Tx) error {
    query := tx.QueryTuples(...)
    // do stuff with query
    return tx.DeleteRelationships(...)
})
if err != nil {
    // the entire transaction was rolled back; surface the error
}

Document architectural design

There needs to be a document that expands on the structure of the repository and the design of the project.

The document should:

  • Reiterate Zanzibar basics
  • Explain differences between SpiceDB & Zanzibar
  • Describe the end-to-end lifetime of a check request

CockroachDB datastore assumes linearizability

The CRDB datastore operates under the assumption that the hybrid logical clock can be used to order all transactions, which might not be a safe assumption.

If we assume that only "overlapping transactions" are guaranteed to be linearizable, the following can occur:
You could issue a ContentChangeCheck, get a Zookie, then immediately do a Check whose evaluation includes a "non-overlapping transitive relationship change" (e.g. a group relationship that was deleted).

There is only a small window where this could actually occur in practice, but that window depends on clock skew.
A simple proposal would be to wait a configurable duration before returning a Zookie.

TestReadBadZookie is flaky

--- FAIL: TestReadBadZookie (0.01s)
    acl_test.go:266: 
        	Error Trace:	acl_test.go:266
        	Error:      	Received unexpected error:
        	            	rpc error: code = OutOfRange desc = invalid zookie: revision has expired
        	Test:       	TestReadBadZookie

Add "public" keyword/type

The Zanzibar implementation at Google uses a special-case userset to represent the set of all users (aka "public").


Because SpiceDB's schema language is more expressive, we have some better options than introducing this concept as a special-cased tuple:

  • A keyword could be used to embellish relations/permissions that are public.
  • We could introduce a type to represent public, but it might be surprising if a user accidentally unions a relation/permission with public.

Hedge against slow or failed datastore requests

In the case of bad network performance, cache misses, or any of a variety of invalid assumptions that can cause latency, the distributed dispatcher should also dispatch to another instance as a "hedged bet", to guarantee a response with as little latency as possible.
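A minimal sketch of the deferred-hedging pattern, assuming a generic call function; a real implementation would derive the delay from a dynamic percentile-latency estimator rather than a fixed value:

package dispatch

import (
	"context"
	"time"
)

// hedge issues the request once and, if no response arrives within the
// hedging delay, issues a second identical request; whichever response
// lands first wins and the loser is cancelled.
func hedge[T any](ctx context.Context, delay time.Duration, call func(context.Context) (T, error)) (T, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // cancels whichever in-flight request loses

	type result struct {
		val T
		err error
	}
	results := make(chan result, 2) // buffered so neither goroutine leaks
	issue := func() {
		val, err := call(ctx)
		results <- result{val, err}
	}

	go issue()
	select {
	case r := <-results:
		return r.val, r.err
	case <-time.After(delay):
		go issue() // the first request is slow: place the hedged bet
	}
	r := <-results
	return r.val, r.err
}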

Zone-aware snapshot selection

From the Zanzibar paper:

When a zookie is not provided, the server uses a default staleness chosen to ensure that all transactions are evaluated at a timestamp that is as recent as possible without impacting latency. On each read request it makes to Spanner, Zanzibar receives a hint about whether or not the data at that timestamp required an out-of-zone read and thus incurred additional latency. Each server tracks the frequency of such out-of-zone reads for data at a default staleness as well as for fresher and staler data, and uses these frequencies to compute a binomial proportion confidence interval of the probability that any given piece of data is available locally at each staleness. Upon collecting enough data, the server checks to see if each staleness value has a sufficiently low probability of incurring an out-of-zone read, and thus will be low-latency. If so, it updates the default staleness bound to the lowest "safe" value. If no known staleness values are safe, we use a two-proportion z-test to see if increasing the default will be a statistically significant amount safer. In that case, we increase the default value in the hopes of improving latency. This default staleness mechanism is purely a performance optimization. It does not violate consistency semantics because Zanzibar always respects zookies when provided.

CockroachDB supports regions and zones. There isn't a way (that I can see) to get statistics on zone information during a single read, but locality data can be fetched within the same transaction with a `SHOW RANGE ... FOR ROW` query.

Document Zookies

A sufficient document on Zookies would cover the following subjects:

  • How they are different from a typical read snapshot token
  • How they are used in Zanzibar
  • How they are used in v0 API
  • How they are used in v1 API
  • How they are represented internally within SpiceDB

Create unique constraint for living Tuples in Postgres datastore

The native CockroachDB schema guarantees uniqueness of Tuples, while the Postgres datastore does not. Theoretically, it's possible for SpiceDB to create duplicate living Tuples if there is a bug. By creating a unique constraint, we can enforce only a single living Tuple for a relationship at the database level.
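In Postgres, one way to express this is a partial unique index that covers only living rows. A hedged sketch with illustrative table, column, and sentinel choices (the real schema may mark deletion with a sentinel transaction ID rather than NULL):

-- Illustrative names: permit at most one living row per relationship by
-- uniquely indexing only rows whose deletion marker is unset.
CREATE UNIQUE INDEX uq_relation_tuple_living
    ON relation_tuple (namespace, object_id, relation, userset)
    WHERE deleted_transaction IS NULL;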

Tuple Query iterators should stream results rather than buffering

Tuple Query iterators load the full set of tuples from the database before passing the iterator to the various callers.

Not all APIs make full use of the result set, so having a streaming iterator could provide some efficiencies when the limit in a lookup is reached.
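For illustration, a streaming contract could look like the following sketch; the names are hypothetical, not SpiceDB's actual iterator API:

package datastore

import "context"

// Tuple is a stand-in for the real relationship type.
type Tuple struct {
	Object, Relation, Subject string
}

// TupleIterator sketches a streaming contract: rows are pulled from the
// database one at a time, so a caller that stops at its limit never pays
// for the rest of the result set.
type TupleIterator interface {
	// Next returns the next tuple, or false once the stream is exhausted.
	Next(ctx context.Context) (Tuple, bool)
	// Err reports any error encountered while streaming.
	Err() error
	// Close releases the underlying rows and connection.
	Close() error
}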

Request hedging

From the Zanzibar paper:

Zanzibar’s distributed processing requires measures to accommodate slow tasks. For calls to Spanner and to the Leopard index we rely on request hedging (i.e. we send the same request to multiple servers, use whichever response comes back first, and cancel the other requests). To reduce round-trip times, we try to place at least two replicas of these backend services in every geographical region where we have Zanzibar servers. To avoid unnecessarily multiplying load, we first send one request and defer sending hedged requests until the initial request is known to be slow.
To determine the appropriate hedging delay threshold, each server maintains a delay estimator that dynamically computes an Nth percentile latency based on recent measurements. This mechanism allows us to limit the additional traffic incurred by hedging to a small fraction of total traffic.
Effective hedging requires the requests to have similar costs. In the case of Zanzibar’s authorization checks, some checks are inherently more time-consuming than others because they require more work. Hedging check requests would result in duplicating the most expensive workloads and, ironically, worsening latency. Therefore we do not hedge requests between Zanzibar servers, but rely on the previously discussed sharding among multiple replicas and on monitoring mechanisms to detect and avoid slow servers

Modular peer discovery for dispatch

Distributed dispatching currently relies on a single implementation of peer discovery that uses Servok.
Because there are various ways to discover peers, this should likely be placed behind an interface so that more implementations can arise over time.
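A sketch of what that seam could look like, with hypothetical naming; the existing Servok-based implementation would become just one provider behind the interface:

package dispatch

import "context"

// PeerDiscovery sketches the proposed seam, leaving room for DNS-based or
// Kubernetes-based discovery implementations later.
type PeerDiscovery interface {
	// Watch emits the full set of peer addresses whenever cluster
	// membership changes, until ctx is cancelled.
	Watch(ctx context.Context) (<-chan []string, error)
}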

Push releases to various container registries

There are various registries with different retention policies for container images.
It is probably worth setting up redundant repositories to hedge against losing any images.

Potential registries:

  • Quay (already being used)
  • Docker Hub
  • GitHub

All Go files should pass linting from revive

revive is run in GitHub Actions, but it does not fail the build when there are linting errors.

Lots of docstrings need to be added, among other changes, and then the GitHub Actions step needs to be updated to fail the build if a new linting error is introduced.

Speculative: Run transaction rollback asynchronously

Right now, the transaction used in the CRDB driver to pin queries to the proper timestamp is rolled back synchronously via a defer tx.Rollback(ctx) call in the query code: https://github.com/authzed/spicedb/blob/main/internal/datastore/common/sql.go#L248

During performance testing it was noticed that this rollback can take anywhere from 0.5ms to 3ms, so it might be possible to get a small win if we can run the rollback asynchronously while the queried data is returned to the caller. This may not be possible given the connection pool in use; we'd need to investigate if this would cause any secondary effects.
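A sketch of the idea, using pgx types; the helper name is hypothetical, and whether this is safe at all depends on how the connection pool hands connections back out:

package common

import (
	"context"

	"github.com/jackc/pgx/v4"
	"github.com/rs/zerolog/log"
)

// rollbackAsync replaces the synchronous `defer tx.Rollback(ctx)` with a
// background rollback so the queried data can be returned to the caller
// immediately.
func rollbackAsync(tx pgx.Tx) {
	go func() {
		if err := tx.Rollback(context.Background()); err != nil && err != pgx.ErrTxClosed {
			log.Warn().Err(err).Msg("asynchronous transaction rollback failed")
		}
	}()
}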

Add flags for bootstrapping with a validation file

We currently use validation files for testing and importing/exporting databases from the playground.

The UX for running an instance of SpiceDB would be improved if there were a command-line flag (--bootstrap-file) that took a path to a validation file and loaded it on start-up. An additional --bootstrap-overwrite flag would force loading to occur regardless of whether or not the data store was empty at startup.
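Hypothetical usage of the proposed flags, starting a fresh instance preloaded from a validation file:

spicedb serve \
    --grpc-preshared-key "somerandomkeyhere" \
    --bootstrap-file ./example.validation.yaml \
    --bootstrap-overwrite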

Namespace config consistency

From the Zanzibar paper:

Because changes to namespace configs can change the results of ACL evaluations, and therefore their correctness, Zanzibar chooses a single snapshot timestamp for config metadata when evaluating each client request. All aclservers in a cluster use that same timestamp for the same request, including for any subrequests that fan out from the original client request.
Each server independently loads namespace configs from storage continuously as they change (§3.1.3). Therefore, each server in a cluster may have access to a different range of config timestamps due to restarts or network latency. Zanzibar must pick a timestamp that is available across all of them. To facilitate this, a monitoring job tracks the timestamp range available to every server and aggregates them, reporting a globally available range to every other server.
On each incoming request the server picks a time from this range, ensuring that all servers can continue serving even if they are no longer able to read from the config storage.

SpiceDB currently caches namespace configs but does not enforce their evaluation at a particular snapshot. Until this is in place, changes to namespace configuration need to be carefully planned.
