clusterhq / flocker
Container data volume manager for your Dockerized application
Home Page: https://clusterhq.com
License: Apache License 2.0
Replaced by https://clusterhq.atlassian.net/browse/FLOC-62
Every public API in flocker.route that accepts an IPv4 address should also work when called with an IPv6 address instead.
Replaced by https://clusterhq.atlassian.net/browse/FLOC-14
Delete a volume on the local node.
The following tests must be performed on master each time a new branch is merged in:
Ideally these would be performed before a branch is merged, but I suspect we do not have the infrastructure for that.
tox allows a broader range of testing (e.g. interacting with installed programs), means you don't have to maintain your own virtualenv, and means more of the automated testing can be run locally.
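As a sketch of what adopting tox could look like (the environment names, the trial test runner, and the docs environment here are assumptions, not the project's actual configuration):

```ini
; Hypothetical minimal tox.ini for flocker.
[tox]
envlist = py27, docs

[testenv]
deps = -rrequirements.txt
commands = trial flocker

[testenv:docs]
deps = sphinx
commands = sphinx-build -b html docs docs/_build/html
```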
This breaks tox.
We need to be able to take snapshots in order to do pushes. Code exists in the zfs-snapshots-70790688 branch.
Once a successful tox build has been run on your repository, the output of the documentation builder is written to ${REPO}/docs/_build/html/index.html. I feel this should be written into the README file so it is discoverable to end-users. Perhaps this means instructions for building the docs should go into the README.
Blockers:
Push a locally owned volume to a remote node. The remote node will not have it mounted, since it does not own the volume, it merely has a copy.
To be able to update deployments you need to be able to figure out what the current state of the deployment is.
To this end, introduce a command, provisionally spelled flocker-node inspect-local-state, which can describe the flocker-related deployment state of the node it is run on.
Flocker deployment configuration looks something like this:
{"mysql-hybridcluster": {"node": "node1"}, "site-hybridcluster.com": {"node": "node2"}}
(the exact syntax isn't relevant).
Flocker application configuration looks something like this:
{"mysql-hybridcluster": {"image": "hybridlogic/mysql5.9", "volume": "/var/run/mysql"},
 "site-hybridcluster.com": {
   "image": "hybridlogic/web",
   "volume": "/var/log",
   "internal_links": {"3306": {"container": "mysql-hybridcluster", "port": 3306}},
   "external_links": {"external": 80, "internal": 80}
 }
}
flocker-node inspect-local-state needs to produce something like this (ideally, something exactly like how deployments and applications are described, though the structure of that information has not yet been finalized) describing what is actually set up on the node at the time the command is run.
This allows centralized management to aggregate all information across the cluster (necessary for the initial simple management interface flocker will present).
The output of this command will make it possible to make decisions about what changes must be made to get the node into the desired state.
Since the only cross-node information we really need for the first release is volume location, this is only blocked on:
It should be possible to enumerate all volumes stored on a specific volume manager node.
We presume DNS is configured to point at all IPs in the cluster. An external route redirects traffic from a specific port (e.g. 80) to the IP/port where it is served.
Certain tests create ZFS filesystems or reconfigure the system's network stack. These are privileged operations so the tests are skipped if the test process isn't privileged (running as root).
The continuous integration setup probably needs to account for this and find some way to get these tests to run.
Replaced by https://clusterhq.atlassian.net/browse/FLOC-56
Forwarding a port involves creating three iptables rules. One of these gets a comment reflecting that Flocker created it. The other two should as well.
This will involve fixing some code that is currently written to the Twisted coding standard.
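The remaining two rules can be tagged with iptables' comment match the same way; a minimal sketch (the chain, port, and destination below are illustrative, not Flocker's actual rules):

```python
def comment_rule(argv, comment="flocker"):
    """Append an iptables comment match so the rule is identifiable."""
    return argv + ["-m", "comment", "--comment", comment]

# One of the three forwarding rules, tagged the way all three should be.
dnat = ["iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", "tcp", "--dport", "80",
        "-j", "DNAT", "--to-destination", "10.0.0.2:80"]
tagged = comment_rule(dnat)
```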
Re-implement the IGearClient interface using Docker directly via the docker-py library.
We're going to start/stop containers using geard. We need code to do that.
Once a branch has been merged into master and has passed automated, integration, and acceptance testing, an RPM must be built. This RPM should then be uploaded to our public repository.
A remote node may push and then handoff a volume to the local node. The volume manager should be able to wait until this happens so some action requiring that volume can proceed.
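The waiting behaviour can be sketched as a simple poll loop; enumerate_volumes here is a stand-in for the volume manager's real enumeration API, which is an assumption of this sketch:

```python
import time

def wait_for_volume(enumerate_volumes, name, timeout=60.0, interval=0.1):
    """Poll until a volume named ``name`` appears locally, or time out.

    ``enumerate_volumes`` is a callable returning the current set of
    local volume names.  Returns True once the volume is present,
    False if the timeout elapses first.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if name in enumerate_volumes():
            return True
        time.sleep(interval)
    return False
```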
For internal testing we need easy to setup environments. Once we do a release we will want an easy way for users to test our software.
We should therefore have some automation (since we'll need to regenerate this as new releases come out) and resulting image of a Fedora 20 machine with Docker, ZFS, geard etc. already setup.
Replaced by https://clusterhq.atlassian.net/browse/FLOC-34
The process running in a container runs as a specific user. E.g. http://docs.docker.io/reference/api/docker_remote_api_v1.11/#22-images - the "Inspect an image" API includes the user name the image will run under. The filesystem mounted into the container must therefore be writeable by this user.
Our current solution is chmod 777, but that's not secure. Better to chown the filesystem to (or maybe tell ZFS about) the appropriate user.
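A minimal sketch of the chown approach; the uid/gid would come from inspecting the image, which is assumed to happen elsewhere:

```python
import os
import stat

def make_writeable_by(path, uid, gid):
    """Give ownership of ``path`` to the container's user.

    Sketch only: ``uid`` and ``gid`` are plain parameters here rather
    than being looked up from the image's "User" field.
    """
    os.chown(path, uid, gid)
    # Owner keeps full access; group and others lose write access,
    # instead of the current world-writeable chmod 777.
    os.chmod(path, 0o755)
```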
Each volume manager node should have a UUID so that volumes from different nodes can be distinguished.
This issue will also cover setting up some minimal infrastructure like a command-line tool and documentation location.
Replaced by https://clusterhq.atlassian.net/browse/FLOC-55
Remote copies of volumes should aim to track the masters - so that if the master is deleted all remote copies are deleted as well. This would avoid confusing conflicts when a volume is deleted and created again with the same name.
To protect against the scenario that a local node has been compromised by an attacker, remote volume destruction could take the form of renaming the filesystem into the trash. This is the approach taken in HybridCluster.
Blocked by:
Takes the application configuration and deployment configuration, connects to all nodes, asks for their local state, combines that into the global current state, and then tells all nodes to do what is necessary to change to the new configuration.
laptop$ cat deploy.cfg
{"version": 1, "mysql-hybridcluster": {"node": "node1"}, "site-hybridcluster.com": {"node": "node2"}}
laptop$ cat app.cfg
{"version": 1,
 "mysql-hybridcluster": {"image": "hybridlogic/mysql5.9", "volume": "/var/run/mysql"},
 "site-hybridcluster.com": {
   "image": "hybridlogic/web",
   "volume": "/var/log",
   "internal_links": {"3306": {"container": "mysql-hybridcluster", "port": 3306}},
   "external_links": {"external": 80, "internal": 80}
 }
}
laptop$ flocker-cli go deploy.cfg app.cfg
# ssh node1 flocker-node inspect-local-state
# {}
# ssh node2 flocker-node inspect-local-state
# {}
# CURRENT_CONFIG={"node1": {}, "node2": {}}
# DEPLOY_CFG=$(cat deploy.cfg)
# APP_CFG=$(cat app.cfg)
# ssh node1 flocker-node change-local-state "${CURRENT_CONFIG}" "${DEPLOY_CFG}" "${APP_CFG}"
# ssh node2 flocker-node change-local-state "${CURRENT_CONFIG}" "${DEPLOY_CFG}" "${APP_CFG}"
Create a new volume on the local machine.
It is no longer relevant.
This is the subset of #11 that involves listing containers.
A locally owned volume has been pushed to a remote node. Handoff indicates that remote node is now the owner of the volume.
Replaced by https://clusterhq.atlassian.net/browse/FLOC-49
Our current implementation has a 1:1 mapping between containers and volumes:
For example, for a container named "myapp-mongodb" a volume called "myapp-mongodb" will be created.
It's likely that users will want the ability to mount multiple volumes within their containers. Each volume would need a different mountpoint. Flocker should have syntax and support for this.
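A hypothetical syntax for this, extending the application configuration shown earlier by replacing the single "volume" key with a named "volumes" mapping (the key names and mountpoints here are illustrative only):

```json
{
  "mysql-hybridcluster": {
    "image": "hybridlogic/mysql5.9",
    "volumes": {
      "data": "/var/lib/mysql",
      "logs": "/var/log/mysql"
    }
  }
}
```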
The current release is incompatible with newer Eliot, and we're not actually using it in a useful way at the moment.
Replaced by https://clusterhq.atlassian.net/browse/FLOC-70
People will enjoy Flocker more if it completes the requested operations quickly. Shoveling ZFS snapshot data streams between storage pools is one operation Flocker will be executing (and somewhat frequently). It would be beneficial for this to be as fast as possible.
Measure how fast it is as a first step towards making sure it is and stays as fast as possible.
It's likely not possible to transfer the snapshot data at a rate greater than the lesser of the disks holding the storage pools (or perhaps half that, if the storage pools share disks, as might be the case for a "loopback" benchmark) or the network connecting the two hosts between which the transfer is occurring. If the benchmark reveals we're not close to that - say, within 20% - then we should consider doing some optimization work.
See #69 for an earlier version of this issue.
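The measurement itself can be sketched independently of ZFS; here the send/recv pipes are replaced by plain file-like objects so the timing logic can be exercised anywhere:

```python
import time

def measure_throughput(source, sink, chunk_size=1024 * 1024):
    """Copy ``source`` to ``sink`` and return the rate in bytes/second.

    In the real benchmark ``source`` would be the stdout pipe of
    ``zfs send`` and ``sink`` the stdin pipe of ``zfs recv``.
    """
    total = 0
    start = time.time()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    elapsed = time.time() - start
    return total / elapsed if elapsed else float("inf")
```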
In order for the volume manager to be able to push volumes to other nodes, we will (as a first pass) need nodes to be able to ssh into each other.
flocker deploy should therefore configure the nodes to support this. This should be done automatically when flocker deploy determines it is necessary.
For this issue, implement an API which will check the state of a single node and put any necessary configuration on it. flocker deploy will invoke this for each node it is going to communicate with before it tries to enact any changes.
The API will be a blocking function that runs the OpenSSH command line client to accomplish the goals.
The API will:

- generate a key pair (using ssh-keygen) and save it in ~/.ssh/id_rsa_flocker{,.pub} (on the admin host, e.g. my laptop)
- connect to each node (using ssh) as root (necessary authentication pieces previously configured by the admin)
- check for /etc/flocker/id_rsa_flocker and /etc/flocker/id_rsa_flocker.pub; if they are missing or contain the wrong key, put the right key in place
- check for an entry in /root/.ssh/authorized_keys; if one is not found, add one for the right key

Replaced by https://clusterhq.atlassian.net/browse/FLOC-38
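The authorized_keys step, for example, could be a small idempotent helper; this sketch acts on a local path, whereas the real version would act on the remote node over ssh as root:

```python
import os

def ensure_authorized_key(authorized_keys_path, public_key_line):
    """Idempotently add a public key line to an authorized_keys file.

    Reads existing entries (if the file exists) and appends the key
    only when it is not already present.
    """
    key = public_key_line.strip()
    existing = []
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as f:
            existing = [line.strip() for line in f]
    if key not in existing:
        with open(authorized_keys_path, "a") as f:
            f.write(key + "\n")
```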
Any time we set up a proxy on a port to get traffic routed to the right host so it can arrive at the right container, we need to make sure the proxy is going to be running the next time the system boots. At least, assuming we want containers to automatically start on the next boot (which is how gear sets them up, so probably).
The current proxying strategy uses iptables rules, which won't necessarily persist across boots. Perhaps we should write them to a systemd unit file - or maybe we just need to invoke iptables-save (or the systemd unit responsible for this) at the right time.
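One possible shape for the systemd approach is a oneshot unit that restores the saved rules at boot; the unit name and rule-file path below are assumptions:

```ini
; /etc/systemd/system/flocker-iptables.service (hypothetical)
[Unit]
Description=Restore Flocker proxy iptables rules
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/iptables-restore /etc/flocker/iptables-rules

[Install]
WantedBy=multi-user.target
```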
Replaced by https://clusterhq.atlassian.net/browse/FLOC-59
Some errors (e.g. creating an already-existing filesystem) are pretty standard and should be explicitly exposed and documented.
I keep forgetting to add new packages to the setup.py list of packages to install. We should just automatically construct that list.
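setuptools' find_packages can do this; the sketch below demonstrates it on a throw-away tree, whereas in setup.py the call would simply be packages=find_packages() against the project root:

```python
import os
import tempfile

from setuptools import find_packages

# Build a throw-away package tree to show what find_packages() returns.
root = tempfile.mkdtemp()
for parts in [("flocker",), ("flocker", "volume"), ("flocker", "route")]:
    os.makedirs(os.path.join(root, *parts))
    open(os.path.join(root, *parts, "__init__.py"), "w").close()

discovered = sorted(find_packages(where=root))
print(discovered)  # ['flocker', 'flocker.route', 'flocker.volume']
```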
A locally owned volume has previously been pushed to a remote node. The local volume has received some changes and a push to the remote node has been requested again. There would be performance gained from re-using the existing data. An incremental ZFS send (zfs send -i) makes this possible.
In order to update the remote node the local node must know the latest snapshot that the remote node has for this volume. That also implies that the snapshot taken at the time of the last push has not been removed from the local node, and that we are therefore collecting a possibly unbounded number of snapshots for each volume - equal to the number of sends which have been performed.
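The incremental invocation can be sketched as argv construction; the filesystem and snapshot names below are illustrative:

```python
def zfs_send_argv(filesystem, snapshot, previous_snapshot=None):
    """Build the argv for a full or incremental ``zfs send``.

    When a snapshot from the previous push is still present locally,
    ``zfs send -i`` transfers only the blocks changed since that
    snapshot; otherwise a full stream is sent.
    """
    argv = ["zfs", "send"]
    if previous_snapshot is not None:
        argv.extend(["-i", "%s@%s" % (filesystem, previous_snapshot)])
    argv.append("%s@%s" % (filesystem, snapshot))
    return argv
```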
Blocked on:
Sub-tasks:
Certain users might find it easier to create their service description files using a point-and-click tool, rather than writing a text file.
The marketing website could also benefit from displaying a nice looking tool with a demo.
Some tools for this purpose already exist; we should consider integrating them:
Gaudi.io: http://gaudi.io/builder.html, source: https://github.com/marmelab/gaudi.io/tree/gh-pages
Juju: https://jujucharms.com/, source: https://github.com/juju/juju-gui
All our command line tools need to:
And perhaps other shared stuff as well. There should exist a Usage class decorator and tests for this functionality. Some of it is already implemented in flocker-volume and can be reused.
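A sketch of the shared-options idea using argparse; the real tools use Twisted's Options classes, and the particular shared flags chosen here are assumptions:

```python
import argparse

def standard_options(parser):
    """Attach the options shared by every flocker command-line tool."""
    parser.add_argument("--version", action="store_true",
                        help="Print the version and exit.")
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="Increase logging verbosity.")
    return parser

parser = standard_options(argparse.ArgumentParser(prog="flocker-volume"))
args = parser.parse_args(["-vv"])
```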
flocker-node is the CLI (later a daemon) that runs on each node and manages it.
Once flocker-cli has received the new config and figured out the existing configuration, it can tell this tool (on each node) to make the necessary local changes to enact that configuration.
Blocked on:
If we create volume myapp with a container-facing mount point of /var/lib/mongodb, we should create a new Docker container called myapp-data that has a -v /path/to/myapp/realmountpoint:/var/lib/mongodb:rw option. This will then automatically be mounted when someone creates a geard unit called myapp.
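Constructing that data container could look like the following sketch; the busybox image and /bin/true command used to create an otherwise inert container are assumptions:

```python
def data_container_argv(volume_name, host_mountpoint, container_mountpoint):
    """Build the ``docker run`` argv for a ``<name>-data`` container.

    The ``-v`` flag maps the volume's real mountpoint on the host to
    its container-facing mountpoint, read-write.
    """
    return [
        "docker", "run",
        "--name", "%s-data" % (volume_name,),
        "-v", "%s:%s:rw" % (host_mountpoint, container_mountpoint),
        "busybox", "/bin/true",
    ]
```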
This requires:
VolumeService.create() should take the mountpoint for the volume - this is not the current mountpoint implemented in #13, but rather the location where this volume should be mounted within a container. So some terminology tightening is probably also necessary.
Unfortunately it seems quite buggy. Changes randomly don't get made or don't get noticed, error output from the underlying C library randomly pops up on stderr, etc.
Just run iptables in a child process instead.
There is folk wisdom that the performance of zfs send is closely related to whether it is able to fill up its output buffer (if it can fill it, it performs poorly - particularly if emptying that buffer involves network round-trips).
HybridCluster suffered from performance problems until it added a buffering layer between zfs send and the network.
Measure the throughput of zfs send on a well-known dataset against various sizes of output buffer.
Generate machine-parseable output for consumption by a continuous benchmarking system (sadly yet-to-be-implemented).
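The buffer-size sweep can be sketched independently of ZFS, with an in-memory source standing in for zfs send on the well-known dataset:

```python
import io
import time

def benchmark_buffer_sizes(data, sizes):
    """Time copying ``data`` once per buffer size.

    Returns (buffer_size, seconds) pairs, which are easy to serialize
    for a continuous benchmarking system.
    """
    results = []
    for size in sizes:
        source, sink = io.BytesIO(data), io.BytesIO()
        start = time.time()
        while True:
            chunk = source.read(size)
            if not chunk:
                break
            sink.write(chunk)
        results.append((size, time.time() - start))
    return results
```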