go-syncstorage's Introduction

Mozilla Sync 1.5 Storage Server in Go

Go-syncstorage is the next-generation sync storage server. It was built to solve the data degradation problem of the Python + MySQL implementation. Logical separation of data is now physical separation of data: in go-syncstorage each user gets their own SQLite database. Many indexes were harmed in the making of this product.

Installing and Running it

The server is distributed as a Docker container. The latest builds and releases can be found on Docker Hub.

Running the server is easy:

$ docker pull mozilla/go-syncstorage:latest
$ docker run -it \
  -e "PORT=8000" \
  -e "SECRETS=secret0,secret1,secret2" \
  -e "DATA_DIR=/data" \
  -v "/host/data/path:/data" \
  mozilla/go-syncstorage

Only three settings are required: PORT, SECRETS and DATA_DIR. The command above sets them and adds a volume mount:

  1. PORT - the port to listen on for HTTP requests
  2. SECRETS - comma-separated list of secrets pre-shared with the token service
  3. DATA_DIR - where to save database files (a path inside the container)
  4. -v - a volume mount so that data is saved on the Docker host machine
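
As an illustration of what "required" means here, the following is a minimal sketch of reading and validating the three settings. It is not the server's actual startup code: the Config type, the LoadConfig function and the error messages are hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Config holds the three required settings. Hypothetical type for this sketch.
type Config struct {
	Port    int
	Secrets []string
	DataDir string
}

// LoadConfig reads the settings through getenv (os.Getenv in a real
// program, a stub here) and returns an error for the first problem found.
func LoadConfig(getenv func(string) string) (Config, error) {
	var c Config

	port, err := strconv.Atoi(getenv("PORT"))
	if err != nil || port < 1 || port > 65535 {
		return c, errors.New("PORT must be a number between 1 and 65535")
	}
	c.Port = port

	// SECRETS is a comma-separated list; empty entries are ignored.
	for _, s := range strings.Split(getenv("SECRETS"), ",") {
		if s = strings.TrimSpace(s); s != "" {
			c.Secrets = append(c.Secrets, s)
		}
	}
	if len(c.Secrets) == 0 {
		return c, errors.New("SECRETS must contain at least one secret")
	}

	c.DataDir = getenv("DATA_DIR")
	if c.DataDir == "" {
		return c, errors.New("DATA_DIR must be set")
	}
	return c, nil
}

func main() {
	// Stubbed environment matching the docker example above.
	env := map[string]string{
		"PORT":     "8000",
		"SECRETS":  "secret0,secret1,secret2",
		"DATA_DIR": "/data",
	}
	c, err := LoadConfig(func(k string) string { return env[k] })
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("port=%d secrets=%d data_dir=%s\n", c.Port, len(c.Secrets), c.DataDir)
}
```

Passing the lookup function as a parameter keeps the sketch testable without touching the real process environment.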

More Configuration

The server has a few knobs that can be tweaked.

Env. Var                  Info
HOST                      Address to listen on. Defaults to 0.0.0.0.
PORT                      Port to listen on.
DATA_DIR                  Where to save DB files. Use an absolute path. :memory: is also valid and keeps SQLite databases in RAM only; this is recommended only for testing and development.
SECRETS                   Comma-separated list of shared secrets. Secrets are tried in order, which allows secret rotation without downtime.
LOG_LEVEL                 Log verbosity. Allowed: fatal, error, warn, debug, info.
LOG_MOZLOG                Can be true or false. Outputs logs in mozlog format.
LOG_DISABLE_HTTP          Can be true or false. Disables logging of HTTP requests.
HOSTNAME                  Hostname value to use in mozlog output.
LIMIT_MAX_REQUESTS_BYTES  Maximum size in bytes of the overall HTTP request body that the server will accept.
LIMIT_MAX_BSO_GET_LIMIT   Maximum number of BSOs that can be returned per GET request. Default: 2500.
LIMIT_MAX_POST_BYTES      Maximum size of a POST request. Default: 2097152 (2MB).
LIMIT_MAX_POST_RECORDS    Maximum number of BSOs per POST request. Default: 100.
LIMIT_MAX_TOTAL_BYTES     Maximum total size of a POST batch job. Default: 26,214,400 (20MB).
LIMIT_MAX_TOTAL_RECORDS   Maximum total number of BSOs in a POST batch job. Default: 1000.
LIMIT_MAX_BATCH_TTL       Maximum time in seconds that a batch may remain uncommitted. Default: 7200 (2 hours).
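
The "tried in order" behaviour of SECRETS can be sketched as below. This is only an illustration of the rotation idea, not the server's token validation: HMAC-SHA256 over a plain string is a stand-in for whatever signature scheme the token service really uses, and sign and matchSecret are hypothetical helper names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
	"strings"
)

// sign returns the HMAC-SHA256 of payload under secret.
// Assumption: a stand-in for the real token signature.
func sign(secret, payload string) []byte {
	m := hmac.New(sha256.New, []byte(secret))
	m.Write([]byte(payload))
	return m.Sum(nil)
}

// matchSecret tries each secret in order and returns the index of the
// first one that validates sig, or -1 if none does.
func matchSecret(secrets []string, payload string, sig []byte) int {
	for i, s := range secrets {
		if hmac.Equal(sign(s, payload), sig) {
			return i
		}
	}
	return -1
}

func main() {
	// During a rotation both the new and the old secret are configured.
	secrets := strings.Split("new-secret,old-secret", ",")

	// A token signed before the rotation still validates (index 1).
	oldSig := sign("old-secret", "uid=100001234")
	fmt.Println(matchSecret(secrets, "uid=100001234", oldSig))

	// A token signed with an unknown secret is rejected (-1).
	badSig := sign("retired-secret", "uid=100001234")
	fmt.Println(matchSecret(secrets, "uid=100001234", badSig))
}
```

Under this scheme a rotation means adding the new secret to the list, waiting for tokens signed with the old one to expire, then removing the old secret.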

Advanced Configuration

Things that probably shouldn't be touched:

Env. Var   Info
POOL_NUM   Number of DB pools. Defaults to the number of CPUs.
POOL_SIZE  Number of open DB files per pool. Defaults to 25.

go-syncstorage limits the number of open SQLite database files to keep memory usage constant. This allows a small server to handle thousands of users at the cost of a small performance hit.

Multiplying POOL_NUM x POOL_SIZE gives the maximum number of open files.

A low-level lock is used in each pool when opening and closing files. A larger POOL_NUM decreases lock contention.

When a pool reaches POOL_SIZE number of open files it will close the least recently used database. Having a larger POOL_SIZE reduces open/close disk IO. It also increases memory usage.

Changing these values from their defaults won't provide significant performance gains in production. However, setting POOL_NUM=1 and POOL_SIZE=1 is useful for testing the overhead of opening and closing database files.
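
The pool behaviour described above (one lock per pool, closing the least recently used database once POOL_SIZE is reached) can be sketched as follows. This is a simplified model rather than the server's implementation: handle stands in for an open SQLite database, nothing is actually opened on disk, and pool, newPool and get are hypothetical names.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// handle stands in for an open SQLite database file.
type handle struct {
	path string
}

// pool keeps at most size handles open and evicts the least recently used.
type pool struct {
	mu      sync.Mutex               // guards opening and closing
	size    int                      // plays the role of POOL_SIZE
	order   *list.List               // front = most recently used
	entries map[string]*list.Element // path -> element in order
	closed  []string                 // paths closed so far, for illustration
}

func newPool(size int) *pool {
	if size < 1 {
		size = 1
	}
	return &pool{size: size, order: list.New(), entries: make(map[string]*list.Element)}
}

// get returns the handle for path, opening it if needed and closing the
// least recently used handle when the pool is full.
func (p *pool) get(path string) *handle {
	p.mu.Lock()
	defer p.mu.Unlock()

	if el, ok := p.entries[path]; ok {
		p.order.MoveToFront(el)
		return el.Value.(*handle)
	}
	if p.order.Len() >= p.size {
		old := p.order.Remove(p.order.Back()).(*handle)
		delete(p.entries, old.path)
		p.closed = append(p.closed, old.path) // a real pool would close the DB here
	}
	h := &handle{path: path}
	p.entries[path] = p.order.PushFront(h)
	return h
}

func main() {
	p := newPool(2)
	p.get("a.db")
	p.get("b.db")
	p.get("a.db") // a.db becomes the most recently used
	p.get("c.db") // pool is full, so b.db is closed
	fmt.Println(p.closed) // [b.db]

	poolNum, poolSize := 4, 25
	fmt.Println("max open files:", poolNum*poolSize) // max open files: 100
}
```

With POOL_SIZE=1 every request for a different user forces a close and an open, which is why that setting is handy for measuring open/close overhead.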

Data Storage

When deploying, choose the ext4 filesystem. ext4 is an extent-based filesystem and may help improve performance on magnetic storage media.

go-syncstorage gives each user their own SQLite database. On a production server that is enough files to be a real burden for a human when troubleshooting. Thus, files are placed in a directory structure like this:

/data-dir/
   00/
   01/
   34/
     21/
       100001234.db
   ...
   99/
  • Two levels of subdirectories, each with 100 entries (00 to 99), for a total of 10,000 leaf directories.
  • The user 100001234 is located at 34/21/100001234.db. The path is built from the end of their id, working backwards two digits at a time: the last two digits (34) name the first level and the two before them (21) name the second. The full id is used for the database file name.
  • Using the reverse order helps evenly balance the number of files per directory.

Using this scheme, one million users spread over 10,000 directories come to only about 100 files per directory. This is a low number that CLI tools like ls will have no trouble with. Always optimize for the proper care and feeding of your sysadmins.

Other Releases

A Linux binary is also available as a build artifact from CircleCI.

License

See LICENSE.
