zalando-stups / taupage

THIS PROJECT IS NO LONGER ACTIVELY MAINTAINED - The base Amazon Machine Image (AMI) allowing dockerized applications to run with STUPS

Home Page: https://docs.stups.io/en/latest/components/taupage.html

License: Other

Shell 29.37% Ruby 2.51% Python 34.13% Perl 28.25% Makefile 1.05% Go 0.96% Dockerfile 0.17% HTML 3.56%
aws ami stups

taupage's Introduction

Zalando AMI generation tooling

Join the chat at https://gitter.im/zalando-stups/taupage

NOTE

While we do not consider Taupage to have reached its end of life, we are currently not planning to add new features or extend its functionalities. We will consider any pull request, but we do not plan to directly work on enhancements.

Prerequisites

You need to have jq and AWS CLI preinstalled.

Build a new AMI

You need to be logged in (mai login). As a configuration example, see the file config-stups-example.sh in the code base and modify it to suit your needs.

$ ./create-ami.sh ./config-stups.sh <version>

This will spin up a new server, configure it, create an AMI from it, terminate the server and share the AMI. If you want to debug the server after setup, add the --dry-run flag: AMI generation, termination and sharing will then be skipped.

$ ./create-ami.sh --dry-run ./config-stups.sh

See the STUPS documentation for more information.

Directory structure

  • /build/ (scripts and files for the initial setup)
    • setup.d/ (all setup scripts that get executed on the server)
  • /runtime/ (everything that has to be present at runtime)
  • /tests/ (contains various tests, such as python, serverspec and shell script tests)


taupage's Issues

[scalyr] Make log parser configurable in user data

The parser to use for the application.log is hardcoded to 'slf4j'. We have different log formats for our applications and would like to introduce multiple parsers for that.

This could be configurable in the user data just like the scalyr api token.
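
A minimal sketch of how the parser could be read from the user data, falling back to the currently hardcoded value. The key name `scalyr_application_log_parser` and the helper function are hypothetical and not taken from the Taupage code base.

```python
DEFAULT_PARSER = "slf4j"  # the value that is currently hardcoded


def get_application_log_parser(config):
    """Return the parser name to use for application.log.

    `config` is the parsed user data (a dict). The key name used here is a
    hypothetical choice for illustration only.
    """
    parser = config.get("scalyr_application_log_parser")
    if not parser:
        # Key missing or empty: keep today's behaviour
        return DEFAULT_PARSER
    return str(parser)
```

Existing deployments that do not set the key would keep using 'slf4j', so the change would be backwards compatible.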

Add --net=host option for starting the docker container

We need this option on the docker command for our ZooKeeper appliance.

For example, we want to achieve the following through senza create:

sudo docker run -p 2181:2181 -p 2888:2888 -p 3888:3888 -d --net=host efd7a9692031

The --net=host option is the part that needs to be added.

Thanks
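
One possible shape for this, sketched under the assumption that the runtime assembles the docker command from the user data: a boolean key (here the hypothetical `docker_net_host`) adds the flag. The function names are illustrative and not the actual Taupage code.

```python
def get_network_options(config):
    """Yield extra `docker run` arguments for the network mode.

    `docker_net_host` is a hypothetical user-data key.
    """
    if config.get("docker_net_host"):
        yield "--net=host"


def build_docker_run(config, image):
    """Assemble a `docker run` command line as a list of arguments."""
    cmd = ["docker", "run", "-d"]
    cmd += list(get_network_options(config))
    cmd.append(image)
    return cmd
```

Note that with host networking the container shares the host's network stack, so Docker does not apply -p port mappings; the ports in the command above would be reachable on the host directly.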

mdadm 10-prepare-disks.py fails

The "prepare-disks" script seems to fail (sometimes) when running mdadm (to build a raid with two devices):

Jun 15 08:10:27 ip-172-31-15-220 kernel: [   23.830888] blkfront: xvdf: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
Jun 15 08:10:27 ip-172-31-15-220 kernel: [   23.834668]  xvdf: unknown partition table
Jun 15 08:10:29 ip-172-31-15-220 kernel: [   25.728675] blkfront: xvdi: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
Jun 15 08:10:29 ip-172-31-15-220 kernel: [   25.733076]  xvdi: unknown partition table
Jun 15 08:10:31 ip-172-31-15-220 kernel: [   28.031018] blkfront: xvdg: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
Jun 15 08:10:31 ip-172-31-15-220 kernel: [   28.035338]  xvdg: unknown partition table
Jun 15 08:10:29 ip-172-31-15-220 taupage-init: message repeated 2 times: [ Waiting for /dev/xvdh to stabilize]
Jun 15 08:10:32 ip-172-31-15-220 taupage-init: mdadm: Cannot find /dev/xvdh: No such file or directory
Jun 15 08:10:32 ip-172-31-15-220 ntpd[1534]: ntpd exiting on signal 15
Jun 15 08:10:32 ip-172-31-15-220 taupage-init: Traceback (most recent call last):
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:   File "./init.d/10-prepare-disks.py", line 237, in <module>
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:     main()
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:   File "./init.d/10-prepare-disks.py", line 230, in main
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:     handle_volumes(args, config)
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:   File "./init.d/10-prepare-disks.py", line 206, in handle_volumes
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:     handle_raid_volumes(volumes.get("raid"))
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:   File "./init.d/10-prepare-disks.py", line 193, in handle_raid_volumes
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:     create_raid_device(raid_device, raid_config)
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:   File "./init.d/10-prepare-disks.py", line 184, in create_raid_device
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:     subprocess.check_call(call)
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:   File "/usr/lib/python3.4/subprocess.py", line 557, in check_call
Jun 15 08:10:32 ip-172-31-15-220 taupage-init:     raise CalledProcessError(retcode, cmd)
Jun 15 08:10:32 ip-172-31-15-220 taupage-init: subprocess.CalledProcessError: Command '['mdadm', '--build', '/dev/md/sampleraid1', '--level=1', '--raid-devices=2', '/dev/xvdh', '/dev/xvdi']' returned non-zero exit status 1
Jun 15 08:10:32 ip-172-31-15-220 taupage-init: Failed to start 10-prepare-disks.py
Jun 15 08:10:33 ip-172-31-15-220 kernel: [   30.036629] blkfront: xvdh: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
Jun 15 08:10:33 ip-172-31-15-220 kernel: [   30.041170]  xvdh: unknown partition table
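
The log suggests that /dev/xvdh only shows up at 08:10:33, one second after mdadm was invoked. A minimal sketch of a guard that waits for every RAID member device before calling mdadm; the helper name and the timeout values are assumptions, not the code in 10-prepare-disks.py.

```python
import os
import time


def wait_for_devices(devices, timeout=60.0, interval=1.0):
    """Block until every path in `devices` exists or `timeout` seconds pass.

    Returns the list of devices that are still missing (empty on success).
    """
    deadline = time.monotonic() + timeout
    missing = [d for d in devices if not os.path.exists(d)]
    while missing and time.monotonic() < deadline:
        time.sleep(interval)
        # Only re-check the devices that were missing on the last pass
        missing = [d for d in missing if not os.path.exists(d)]
    return missing
```

A caller could abort with a clear error message when the returned list is non-empty instead of letting mdadm fail with "No such file or directory".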

Remove Loggly support

We are not using Loggly anymore (we were not satisfied with their performance), thus we should remove support for it in the Taupage AMI.

Implement optional health check for etcd discovery

The etcd discovery process should take an optional health check configuration from the taupage config, for example "etcd_health_check". When it is set, the health check should be run first, and the etcd cluster should only be pinged if the check succeeds.

Add error handling for build process

At the moment, the build process mostly continues when errors occur.

I suggest changing this so that any error, and especially any unhandled error, aborts the build process.

A good start is to set

set -u -e -E -C -o pipefail

in create-ami.sh, setup.sh, test.sh etc.

Enhance AMI tests

The AMI should be properly tested with Serverspec. Additional tests therefore have to be written, and the tests need to run after the image has been created and before the HTTP server is started.

TypeError: argument of type 'int' is not iterable

Jun 15 07:55:12 ip-172-31-12-207 taupage-init: Traceback (most recent call last):
Jun 15 07:55:12 ip-172-31-12-207 taupage-init:   File "/opt/taupage/runtime/Docker.py", line 297, in <module>
Jun 15 07:55:12 ip-172-31-12-207 taupage-init:     main(args)
Jun 15 07:55:12 ip-172-31-12-207 taupage-init:   File "/opt/taupage/runtime/Docker.py", line 278, in main
Jun 15 07:55:12 ip-172-31-12-207 taupage-init:     cmd += list(f(config))
Jun 15 07:55:12 ip-172-31-12-207 taupage-init:   File "/opt/taupage/runtime/Docker.py", line 132, in get_port_options
Jun 15 07:55:12 ip-172-31-12-207 taupage-init:     if '/' in host_port:
Jun 15 07:55:12 ip-172-31-12-207 taupage-init: TypeError: argument of type 'int' is not iterable
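
The traceback shows `'/' in host_port` being evaluated on an int, presumably because the port was written as a bare number in the configuration. A minimal sketch of a fix that converts the value to a string before inspecting it; `split_port` is a hypothetical helper, not the actual code in Docker.py.

```python
def split_port(port, default_protocol="tcp"):
    """Split a port spec such as 8080, "8080" or "53/udp" into (port, protocol)."""
    text = str(port)  # a bare 8080 arrives as int, so normalise to str first
    if "/" in text:
        number, protocol = text.split("/", 1)
    else:
        number, protocol = text, default_protocol
    return int(number), protocol
```

Both integer and string ports are then accepted, and an optional protocol suffix is still recognised.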

rkhunter

During build, index the filesystem; during runtime, check the filesystem.

ETCD_URL not set in container

If "etcd_discovery_domain" is specified and the etcd proxy gets started, the container still does not get "ETCD_URL" set. It should point to the running etcd proxy (which was purposely started on the docker0 IP).

Scalyr agent is installed on instance start -> should be baked into the AMI

Check the following process list:

4 0 1237 1 20 0 4440 644 wait Ss ? 0:00 /bin/sh -e -c /opt/zalando/init.sh 2>&1 | logger -t zalando-init /bin/sh
0 0 1238 1237 20 0 4440 704 wait S ? 0:00 _ /bin/sh /opt/zalando/init.sh
0 0 1431 1238 20 0 4440 656 wait S ? 0:00 | _ /bin/sh ./init.d/81-register-scalyr-agent.sh
4 0 1458 1431 20 0 18020 1640 wait S ? 0:00 | _ bash /opt/zalando/installfiles/install-scalyr-agent-2.sh --set-api-key 0xxx --star
0 0 1781 1458 20 0 61228 32288 poll_s S ? 0:00 | _ apt-get -y install scalyr-agent-2
0 0 1786 1781 20 0 182260 4396 poll_s S ? 0:00 | _ /usr/lib/apt/methods/https

Configurable realms

The realm that is used to get an OAuth2 access token (e.g. for pulling the docker image from pierone) should be configurable (at AMI build time), similar to the token service url.

Auto decrypt environment variables using KMS

It would be a nice feature if the Taupage image decrypted environment variables using Amazon's KMS before setting them for the Docker runtime.

This would basically solve storing credentials and handling them for a lot of frameworks.
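
A rough sketch of how this could look, assuming encrypted values are marked with a prefix and stored as base64. The `aws:kms:` marker, the function name and the injected `decrypt` callable are assumptions made for illustration; the callable keeps the sketch independent of any particular AWS client library.

```python
import base64

KMS_PREFIX = "aws:kms:"  # hypothetical marker for encrypted values


def decrypt_environment(env, decrypt):
    """Return a copy of `env` with all marked values decrypted.

    `decrypt` is a callable that takes ciphertext bytes and returns plaintext
    bytes, for example a thin wrapper around a KMS client.
    """
    result = {}
    for key, value in env.items():
        if isinstance(value, str) and value.startswith(KMS_PREFIX):
            ciphertext = base64.b64decode(value[len(KMS_PREFIX):])
            result[key] = decrypt(ciphertext).decode("utf-8")
        else:
            # Unmarked values are passed through untouched
            result[key] = value
    return result
```

With boto3, `decrypt` could be something like `lambda blob: boto3.client("kms").decrypt(CiphertextBlob=blob)["Plaintext"]`, provided the instance role is allowed to use the key.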

Add failure strategy

Add configuration options for failure handling (when the Docker container stops running). Possible strategies:

  • Restart the container endlessly
  • Restart the container X times and then shut down
  • Shut down immediately

Currently, if a Docker image breaks, the server just keeps running without an application.
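
The three strategies can be expressed with a single parameter. A minimal sketch, assuming a hypothetical `run_container` callable that blocks until the container exits; none of these names come from the Taupage code base.

```python
def supervise(run_container, max_restarts=None):
    """Run the container and apply a restart policy whenever it stops.

    `run_container` blocks until the container exits. `max_restarts` is
    None for endless restarts, 0 for "shut down immediately" and N for
    "restart N times, then shut down".
    Returns the number of restarts performed before giving up.
    """
    restarts = 0
    while True:
        run_container()
        if max_restarts is not None and restarts >= max_restarts:
            return restarts  # the caller would now shut the server down
        restarts += 1
```

With `max_restarts=None` the function never returns, which corresponds to the endless restart strategy.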

Lower the timeout to reach fullstop

We recently had issues with fullstop. During init, the taupage config is pushed to fullstop for auditing. This call usually took ~5 min to time out.

This timeout should be lowered so that the instance startup fails faster; otherwise it competes with the health checks.
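
A minimal sketch of a push with an explicit, short timeout, using only the Python standard library. The function name, the default of 10 seconds and the JSON payload format are assumptions; the actual fullstop API and the way Taupage calls it may differ.

```python
import json
import urllib.request


def push_config(url, config, timeout=10):
    """POST `config` as JSON to `url`; return True on success, False otherwise.

    `timeout` (in seconds) bounds each blocking socket operation, so an
    unreachable service fails quickly instead of hanging for minutes.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError
        return False
```

Whether a failed push should abort the startup or only be logged is a separate policy decision that this sketch leaves to the caller.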
