capgemini / apollo
:rocket: An open-source platform for cloud native applications based on Apache Mesos and Docker.
Home Page: http://capgemini.github.io/devops/apollo/
License: MIT License
Create a high-level roadmap that we can dump desired features into
At the moment we have a set of bash scripts which work but are slightly clunky / ugly.
We should be able to switch over to ansible fairly easily. This would allow us to -
ssh "$node" "echo '{\"ui_dir\": \"/opt/consul-ui\", \"server\": true, \"bootstrap_expect\": ${nodes}, \"service\": {\"name\": \"consul\", \"tags\": [\"consul\", \"bootstrap\"]}}' >/etc/consul.d/bootstrap.json"
For the second one we could potentially use https://github.com/adammck/terraform-inventory (AWS only at the moment), which generates a dynamic ansible inventory based on a terraform state file.
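Until a switch to ansible happens, the escaped JSON in the ssh command above could be made less fragile with a heredoc. This is only a sketch: the function name `consul_bootstrap_json` is made up for illustration, and the keys and values are copied from the existing command.

```shell
# Print the Consul bootstrap config for a cluster of $1 server nodes.
# Keys and values mirror the inline JSON in the existing ssh command.
consul_bootstrap_json() {
  nodes="$1"
  cat <<EOF
{
  "ui_dir": "/opt/consul-ui",
  "server": true,
  "bootstrap_expect": ${nodes},
  "service": {"name": "consul", "tags": ["consul", "bootstrap"]}
}
EOF
}

# Hypothetical usage: stream the config to a node over ssh.
# consul_bootstrap_json "$nodes" | ssh "$node" "cat > /etc/consul.d/bootstrap.json"
```

Generating the file this way avoids the nested quote escaping, which is where most of the clunkiness comes from.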
A bug in the consul config. Cannot connect to the web UI.
Similar to this issue - hashicorp/consul#599
Probably just needs a config edit (around the bind address)
We should be able to demo chaos monkey. The demo would be -
start up a bunch of containers
start up the chaos monkey container
watch some containers die
watch mesos reprovision those containers automagically
Could possibly borrow some stuff from here -
https://gist.github.com/andyshinn/92f9175a8cc79185314e
Not sure of the implications or extra hoops required to run Mesos on CoreOS. Some people seem to have experimented with it -
http://mesosphere.com/docs/tutorials/mesosphere-on-a-single-coreos-instance/
https://groups.google.com/forum/#!searchin/coreos-user/mesos/coreos-user/hsjuvrL8eM0/wgHVVEARxnYJ
https://github.com/veverjak/coreos-mesos-marathon
e.g.
performance testing
monitoring
logging
It would be nice if we had pluggability so that we could mix and match our infrastructure on-demand.
For example we should be able to pick and choose which mesos frameworks to install http://mesosphere.com/docs/frameworks/ at runtime.
This should be possible by extracting stuff out to terraform modules https://www.terraform.io/docs/configuration/modules.html , and then composing them as we wish.
Later down the line we might have some higher order tool (command line / UI) do the composing of the terraform plan. I don't think that's needed imminently, but a way forward for implementing the underlying pluggability is desired.
Similar to our jenkins setup internally and kinda like this https://mesosphere.com/blog/2015/04/02/continuous-deployment-with-mesos-marathon-docker
We should probably create a simple demonstrator github repo - and spin out the other necessary bits in AWS. Maybe we could use a cloud CI (e.g. wercker / travis / circleCI / drone.io / or something similar so we don't have to do the jenkins heavy lifting).
If we hooked this up to quay.io for the docker registry then we could build / push from the cloud CI to quay.io and trigger a deployment to marathon.
Wercker supports deployments - maybe we could write a custom deployment step for that to deploy into AWS (needs more investigation)
I think there will be some issues to sift through around virtualbox (as there are some virtualbox scripts getting added in those)
Needs to be deployed on each slave
we might want to wait on this hashicorp/terraform#1329
If we can use the michaelpage site - that would be nice
Basically mysql / memcache / drupal in containers that we can deploy to the stack. Depends on #53
For example, if you visit http://10.0.1.11:5050 and .13 is the master, Mesos will redirect you to -
http://ip-10-0-1-13.eu-west-1.compute.internal:5050/
Which is not resolvable through the browser / VPN.
A similar thing occurs if you click on a link from the web interface to one of the frameworks. We probably need to set -
/etc/mesos-master/hostname
/etc/marathon/conf/hostname
to addresses that we can resolve properly.
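A minimal sketch of setting both files during provisioning follows. The function name and the optional root prefix are assumptions added for illustration (the prefix only exists so the function can be tried in a scratch directory), and the services would presumably need a restart afterwards to pick up the change.

```shell
# Write a resolvable address into the Mesos master and Marathon hostname files.
# $1: address that browsers / VPN clients can resolve (e.g. the private IP)
# $2: optional root prefix (hypothetical, for trying this outside /etc)
set_advertised_hostname() {
  addr="$1"
  root="${2:-}"
  mkdir -p "${root}/etc/mesos-master" "${root}/etc/marathon/conf"
  echo "$addr" > "${root}/etc/mesos-master/hostname"
  echo "$addr" > "${root}/etc/marathon/conf/hostname"
}

# Hypothetical usage on the master at 10.0.1.13:
# set_advertised_hostname 10.0.1.13
```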
At the moment we have this -
"echo ${var.master_ips.master-0} >> /home/ubuntu/masters",
"echo ${var.master_ips.master-1} >> /home/ubuntu/masters",
"echo ${var.master_ips.master-2} >> /home/ubuntu/masters"
It would be better if we could somehow lookup ${var.master_ips.*} using count.index or something similar (if possible)
Might be able to reuse a function from over here
https://www.terraform.io/docs/configuration/interpolation.html
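On the shell side, the three echo lines could collapse into one loop if the IPs were passed in as a single list. Whether one of the interpolation functions on the page above can produce that list from `var.master_ips` has not been checked here, so treat this as a sketch; `write_masters` is a made-up helper name.

```shell
# Append every master IP given as an argument to the masters file.
# $1: target file, remaining arguments: one IP each
write_masters() {
  file="$1"
  shift
  for ip in "$@"; do
    echo "$ip" >> "$file"
  done
}

# Hypothetical usage with the addresses already expanded by terraform:
# write_masters /home/ubuntu/masters 10.0.1.11 10.0.1.12 10.0.1.13
```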
Depends on #12 . Should maybe hang fire until we test a bit more with AWS to avoid having to refactor in 2 places.
Consisting of -
Should include -
We need to be able to monitor the entire stack. Some things we need to monitor -
Could potentially use https://ide.visualops.io
At the moment datacenter is defaulting to "DC1"
It should match the AWS region (or AZ) that particular node is in (for example "eu-west-1")
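One way to derive the name on each node is to read the availability zone from the EC2 instance metadata service and strip the trailing zone letter. This is a sketch under the assumption that the nodes run on EC2; `az_to_region` is a made-up helper name, and the result would still need to be written into the consul config as the `datacenter` value.

```shell
# Strip the trailing zone letter from an availability zone name,
# e.g. "eu-west-1a" becomes "eu-west-1".
az_to_region() {
  echo "$1" | sed 's/[a-z]$//'
}

# On an EC2 node the AZ can be read from the instance metadata service:
# az="$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)"
# datacenter="$(az_to_region "$az")"
```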
Mainly just a lift / shift from here https://www.airpair.com/aws/posts/ntiered-aws-docker-terraform-guide
with addition of installing tunnelblick
Should include
See http://blog.wercker.com/2013/11/27/Slack-Notifications.html
We should push this to an #apollo channel on slack
https://github.com/deverton/terraform-aws-consul as an example
These should test things like -
This is basically mule running in a docker container that we can deploy in the cluster
Something like http://go.docker.com/e/44082/mesos-with-docker-compose-html/3rhy4/317817635
but with consul + weave added
Components we need to document (for now) -
We have no dependency set in mesos-slaves.tf, meaning that if a slave node comes up before a master, it will not connect to the mesos cluster properly.
We need to bring up all the masters first, then join the slaves
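Besides declaring the dependency in terraform, a slave could also wait for a master to answer before it starts. The sketch below is a generic retry helper; the name `retry`, the `MASTER_IP` placeholder and the idea of probing the master UI port are assumptions for illustration, not something taken from the repo.

```shell
# Retry a command until it succeeds.
# $1: max attempts, $2: seconds to sleep between attempts, rest: the command
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical usage on a slave before starting mesos-slave
# (MASTER_IP is a placeholder, not a variable from the repo):
# retry 30 10 curl -sf "http://${MASTER_IP}:5050/" >/dev/null
```

This does not replace ordering the resources in terraform, but it makes a slave tolerant of masters that are slow to come up.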
We do this on the NAT instance -
sudo iptables -t nat -A POSTROUTING -j MASQUERADE
which is supposed to allow traffic from the private instances to reach the internet via the NAT machine. However, this does not work at the moment, so we need to fix it or find a workaround.
Sometimes a serverspec test fails in wercker but the build just carries on and it looks like it passes
On a first run getting this error -
aws_instance.mesos-master.0: Creation complete
Error applying plan:
1 error(s) occurred:
* dial tcp 52.16.231.211:22: i/o timeout
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
The IP address obviously changes run to run with the instances changing.
This is so we can serve DNS requests for the .consul domain via the consul DNS server.
See www.morethanseven.net/2014/04/25/consul/ for more info
Due date for 0.4 to land is this week. We need to resolve any issues there may be with that (I had a few issues while trying to go to terraform master branch)
At the moment to get up and running on AWS we need to
We should be able to create a single wrapper script that chains all of this, so we run a single command and end up with the relevant tabs open in a web browser once everything has completed.
Needs to be on each slave
we might want to wait on this hashicorp/terraform#1329
We should be able to just provision a single slave instance, then use amazon auto-scaling to handle provisioning new slave instances dynamically on demand.
See http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/as-scale-based-on-demand.html
and http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/GettingStartedTutorial.html
We should be able to get to (in a browser) the mesos admin UI via the private_dns name through the vpn server.
We have this in the provisioning -
"sudo iptables -t nat -A POSTROUTING -j MASQUERADE",
but it doesn't appear to be working correctly.
See also http://serverfault.com/questions/637612/trouble-hitting-ec2-amazonprovideddns-server-over-vpn
e.g. we do this
sudo apt-get install -y mesos
which just installs 'latest' from the repo. So at the moment we think we're building a 0.21 box but it's actually a 0.22 one (since that's the latest in the repo).
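A possible fix is to pin the package with apt's `package=version` syntax and then verify what actually got installed. The sketch below assumes a Debian/Ubuntu box; `version_matches` is a made-up helper, and the version string in the usage comment is a placeholder, not a real package version.

```shell
# Succeed only if the installed version ($2) starts with the expected prefix ($1).
# Include the trailing dot in the prefix so "0.21." does not match "0.210".
version_matches() {
  case "$2" in
    "$1"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical usage. List the versions the repo really offers with
#   apt-cache madison mesos
# and substitute one of them for the placeholder below.
# sudo apt-get install -y "mesos=<exact-version-from-madison>"
# installed="$(dpkg-query -W -f='${Version}' mesos)"
# version_matches "0.21." "$installed" || echo "unexpected mesos version: $installed"
```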
We need a test for ensuring this file exists with the correct content -
echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul
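As a starting point, the check could look like the plain shell sketch below. The function name is made up, and the real test would probably live alongside the existing serverspec tests rather than in a shell script.

```shell
# Succeed only if the file exists and contains exactly the expected line.
# $1: path to check, defaults to the dnsmasq drop-in written during provisioning
check_consul_dnsmasq() {
  file="${1:-/etc/dnsmasq.d/10-consul}"
  expected='server=/consul/127.0.0.1#8600'
  [ -f "$file" ] && [ "$(cat "$file")" = "$expected" ]
}

# Hypothetical usage on a provisioned node:
# check_consul_dnsmasq || echo "dnsmasq consul config missing or wrong"
```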