kube-deploy's Introduction

kube-deploy

This is a repository of community maintained Kubernetes cluster deployment automations.

Think of this as https://github.com/kubernetes/contrib for deployment automations! Each subdirectory is its own project. It should be a place where people can come to see how the community is deploying Kubernetes, and it should allow for faster development iteration compared to developing in the main repository.

kube-deploy's People

Contributors

ajitak, bassco, cheld, chrislovecnm, craigtracey, derekwaynecarr, eparis, erictune, erlyfall, jessicaochen, jsleeio, justinsb, k4leung4, k8s-ci-robot, karan, kcoronado, krousey, luxas, marekbiskup, medinatiger, mikedanese, mrincompetent, p0lyn0mial, pipejakob, prateekgogia, roberthbailey, saad-ali, uluyol-goog, vishh, zmerlynn

kube-deploy's Issues

upup: re-enable GCE

We added a fail-fast for GCE because upup support is lagging behind AWS.

We should get GCE back up to par, and then remove the fail-fast.

Project Description

I'm confused by the project description. Will addon-manager be pulled from k8s and live here? Are there bigger plans for this project?

upup: specify hosted zone for aws

It should be possible to specify the Hosted Zone on AWS.

Currently it is implicit: e.g. MYZONE="test.foo.bar.example.com" will result in *awstasks.DNSZone dnsZone/example.com, whereas the hosted zone is actually foo.bar.example.com.

Running the master node fails with docker-multinode

root@kube-master:~/kube-deploy/docker-multinode# service docker start
root@kube-master:~/kube-deploy/docker-multinode# ps axu | grep docker
root      6627  0.0  0.5 146936 10576 ?        Ssl  01:59   0:00 docker-containerd -l /var/run/docker-bootstrap/libcontainerd/docker-containerd.sock --runtime docker-runc --start-timeout 2m
root      6937  2.5  1.6 348124 34124 ?        Ssl  02:01   0:00 /usr/bin/docker -s overlay daemon -H fd://
root      6944  0.3  0.4 286204  9868 ?        Ssl  02:01   0:00 docker-containerd -l /var/run/docker/libcontainerd/docker-containerd.sock --runtime docker-runc --start-timeout 2m
root      6995  0.0  0.0  12956   936 pts/0    S+   02:01   0:00 grep --color=auto docker
root@kube-master:~/kube-deploy/docker-multinode# ./master.sh 
+++ [0612 02:01:36] K8S_VERSION is set to: v1.2.4
+++ [0612 02:01:36] ETCD_VERSION is set to: 2.2.5
+++ [0612 02:01:36] FLANNEL_VERSION is set to: 0.5.5
+++ [0612 02:01:36] FLANNEL_IPMASQ is set to: true
+++ [0612 02:01:36] FLANNEL_NETWORK is set to: 10.1.0.0/16
+++ [0612 02:01:36] FLANNEL_BACKEND is set to: udp
+++ [0612 02:01:36] DNS_DOMAIN is set to: cluster.local
+++ [0612 02:01:36] DNS_SERVER_IP is set to: 10.0.0.10
+++ [0612 02:01:36] RESTART_POLICY is set to: on-failure
+++ [0612 02:01:36] MASTER_IP is set to: 163.172.162.23
+++ [0612 02:01:36] ARCH is set to: amd64
+++ [0612 02:01:36] NET_INTERFACE is set to: eth0
+++ [0612 02:01:36] --------------------------------------------
+++ [0612 02:01:36] Detected OS: ubuntu
+++ [0612 02:01:36] Launching docker bootstrap...
!!! [0612 02:01:55] docker bootstrap failed to start. Exiting...
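
When the bootstrap daemon fails to come up, the next step is usually to look at its own output rather than the main daemon's. A minimal debugging sketch, assuming the script uses the conventional docker-bootstrap socket and log locations (the exact paths may differ in your checkout):

# Check whether the bootstrap daemon answers on its socket (path is an assumption)
docker -H unix:///var/run/docker-bootstrap.sock info

# If it doesn't, its log usually says why it exited
tail -n 50 /var/log/docker-bootstrap.log

# One possible culprit is a stale socket/pid file from a previous run; clean up and retry
rm -f /var/run/docker-bootstrap.sock /var/run/docker-bootstrap.pid
./master.sh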

How does docker-multinode differ from 'production' k8s clusters?

Hi,

We're evaluating k8s internally and, to do so, I've brought up a two-machine local cluster with the kube-deploy docker-multinode scripts.

I've read some k8s developers saying that this deployment option is different from (less reliable than?) the standard 'production-ready' clusters deployed using (I think) the kube-up cluster/ scripts.

Is that so? If yes, what are the main differences between the two deployment options? Is HA covered in kube-up?

Thank you!

upup: error in kube-scheduler log

W0611 00:02:27.020702       7 client_config.go:355] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W0611 00:02:27.020788       7 client_config.go:360] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
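
The warning itself names the two flags that would silence it. A sketch of pointing the scheduler at the API server explicitly; the address and path below are assumptions, substitute whatever your master actually exposes:

# Point kube-scheduler at the apiserver instead of relying on inClusterConfig
kube-scheduler --master=http://127.0.0.1:8080

# ...or hand it a kubeconfig with credentials (path is a placeholder)
kube-scheduler --kubeconfig=/var/lib/kube-scheduler/kubeconfig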

upup: not all dependencies deleted

We don't currently delete (a manual cleanup sketch with the AWS CLI follows the list):

  • the SSH keypair
  • the IAM role / policy etc (we could maybe use the IAM path to mark these)
  • the DNS zone (that is probably correct)
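
A sketch of that manual cleanup; the key pair, role, and instance profile names are placeholders for whatever upup created for your cluster:

# SSH keypair
aws ec2 delete-key-pair --key-name kubernetes.mycluster.example.com

# IAM: inline policies go before the role, and the role comes out of the instance profile first
aws iam delete-role-policy --role-name masters.mycluster.example.com --policy-name masters.mycluster.example.com
aws iam remove-role-from-instance-profile --instance-profile-name masters.mycluster.example.com --role-name masters.mycluster.example.com
aws iam delete-role --role-name masters.mycluster.example.com
aws iam delete-instance-profile --instance-profile-name masters.mycluster.example.com

# The DNS zone is intentionally left alone, per the note above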

upup: describe vendoring and 1.5 build options

We should support people building with Go 1.5 (or document a restriction if we can't do that). This may be as simple as adding GO15VENDOREXPERIMENT=1 to the instructions / Makefile.
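
A minimal sketch of what the documented Go 1.5 build could look like; the package path matches this repo's layout, but the exact build target is an assumption:

# Go 1.5 only honours the vendor/ directory when this is set (it becomes the default in 1.6)
export GO15VENDOREXPERIMENT=1
go build k8s.io/kube-deploy/upup/...

# or inline, if we only want to touch the Makefile / instructions
GO15VENDOREXPERIMENT=1 make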

PKI can be created with Terraform

Just for the record, certs can be generated using Terraform. I haven't tested this feature and only learned about it somewhat recently; I would love to hear if anyone has tried it in a Kubernetes context or in general. It does seem appropriate for min-turnup and any other solutions using Terraform.

Can we reuse componentconfig for our configuration spec?

I'd like to experiment with using the componentconfig types for our configuration. (I think it would be awesome if we could converge on a single configuration system for the various phases, and the types in k8s itself seem a logical choice)

Is there a way to instruct e.g. the kubelet to read its configuration from a JSON file containing a componentconfig.KubeletConfiguration? Or to generate flags from an object of the same?

I've poked around but didn't see anything.

cc @mikedanese

upup: prune configuration schema

It is easy to add things later; hard to remove them.

If we aren't actively using something, remove it / comment it out from the configuration schema.

upup: unusual bug when deleting ASG

autoscaling-group:kubernetes.master.eu-west-1a.kubernetes-e2e-upup-aws.awsdata.com      error deleting resource, will retry: error deleting autoscaling group "kubernetes.master.eu-west-1a.kubernetes-e2e-upup-aws.awsdata.com": ValidationError: AutoScalingGroup name not found - AutoScalingGroup 'kubernetes.master.eu-west-1a.kubernetes-e2e-upup-aws.awsdata.com' not found

I guess this is the "already deleted" message.
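
If so, the delete loop could confirm that instead of retrying. A rough check with the AWS CLI, using the group name from the error above:

# An empty result means the group is already gone and the error can be treated as success
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names kubernetes.master.eu-west-1a.kubernetes-e2e-upup-aws.awsdata.com \
  --query 'AutoScalingGroups[].AutoScalingGroupName'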

upup: tracking issue to reach v1

Primary tasks before we can declare v1:

P0

  • finalize configuration schema (#44)
  • reach consensus on protokube
  • clean up terraform generation

P1

  • fix tagging edge cases

P2

  • bring up master (& etcd) in HA mode
  • decide if we want to use ELB on single-master
  • store state in S3?
  • encrypt state using KMS?

Proposal for kube-up v1 -> kube-up v2 strategy

Straw-man proposal for how we should get from kube-up v1 to kube-up v2:

  • Create v2 of the create-instances-on-cloud side (i.e. the bash side of kube-up). We have a growing list of requirements for it, and we also have users that would use it today to do tasks that the current kube-up tool is bad at (e.g. adding more nodes of a different instance type to an existing cluster). My proposal is that we use something like https://github.com/kopeio/kope/tree/gce. We can start getting this tested right away, with 1.2; some people have already used the kope tool to migrate from 1.1 -> 1.2.
  • Create an integration test for upgrading 1.2 -> 1.3 using the above tool (and the other tools we create here...).
  • We should export a configuration from the above tool to terraform. Terraform seems the most popular such tool. This would demonstrate that while we have our own easy-to-use tool, we can in fact work with the other tools people want, and are not forcing anyone to use ours.
  • Create the node-side v2 (i.e. the Salt side of kube-up). I think we should use something like #5. I suggest we start with it using the v1.2 configuration file format (kube_env.yaml).
  • Export a configuration from the node-side v2 to cloud-init. Just like we did with the bash-replacement side, this will verify that our Salt replacement is not a lock-in.
  • Define a v2 configuration format that makes more use of ConfigMap, is more consistent, etc. Change both the front-end and back-end tools to support it, but we need an upgrade story, so we'll likely need to support both formats in 1.3.
  • Use our new tooling to make HA master work. Ideally through k8s-on-k8s, but even if we can't get there, HA master will still be a good test that our new system is powerful enough.

[min-turnup] kubelet.service can hang node

edit: see further down for the node hanging issue


I have something close to working with min-turnup, but I'm running into an issue with the playbook.

Where does master_ip get added to the cfg object here?

RuntimeError: RUNTIME ERROR: No such field: master_ip
    std.jsonnet:584:29-55   thunk <val>
    std.jsonnet:589:41-43   thunk <val>
    std.jsonnet:440:30-32   thunk <a>
    std.jsonnet:28:21   
    std.jsonnet:28:12-22    thunk <a>
    std.jsonnet:28:12-34    function <anonymous>
    std.jsonnet:28:12-34    function <anonymous>
    std.jsonnet:440:17-33   function <format_code>
    std.jsonnet:589:29-63   thunk <s>
    std.jsonnet:594:38  thunk <str>
    ...
    std.jsonnet:595:61-68   thunk <v>
    std.jsonnet:595:21-69   function <format_codes_obj>
    std.jsonnet:595:21-69   function <format_codes_obj>
    std.jsonnet:600:13-48   function <anonymous>
    std.jsonnet:134:13-28   function <anonymous>
    /opt/playbooks/roles/node/templates/kubeconfig.jsonnet:16:17-45 object <anonymous>
    /opt/playbooks/roles/node/templates/kubeconfig.jsonnet:(14:16)-(17:7)   object <anonymous>
    /opt/playbooks/roles/node/templates/kubeconfig.jsonnet:(12:16)-(18:5)   thunk <array_element>
    /opt/playbooks/roles/node/templates/kubeconfig.jsonnet:(12:15)-(18:6)   object <anonymous>
    During manifestation    

upup: issue pulling glog dependency (with glide)

Issue reported to me:

# I think strip-vendor is the workaround for 25572
glide install --strip-vendor --strip-vcs
[INFO] Downloading dependencies. Please wait...
[INFO] Fetching updates for github.com/aws/aws-sdk-go.
[INFO] Fetching updates for github.com/BurntSushi/toml.
[INFO] Fetching updates for github.com/cloudfoundry-incubator/candiedyaml.
[INFO] Fetching updates for github.com/davecgh/go-spew.
[INFO] Fetching updates for github.com/ghodss/yaml.
[INFO] Fetching updates for github.com/inconshreveable/mousetrap.
[INFO] Fetching updates for github.com/mitchellh/mapstructure.
[INFO] Fetching updates for github.com/spf13/cobra.
[INFO] Fetching updates for github.com/spf13/cast.
[INFO] Fetching updates for github.com/spf13/pflag.
[INFO] Fetching updates for github.com/spf13/jwalterweatherman.
[INFO] Fetching updates for golang.org/x/crypto.
[INFO] Fetching updates for github.com/golang/protobuf.
[INFO] Fetching updates for github.com/hashicorp/hcl.
[INFO] Fetching updates for github.com/magiconair/properties.
[INFO] Fetching updates for github.com/jmespath/go-jmespath.
[INFO] Fetching updates for github.com/golang/glog.
[INFO] Fetching updates for github.com/spf13/viper.
[INFO] Fetching updates for github.com/fsnotify/fsnotify.
[INFO] Fetching updates for github.com/go-ini/ini.
[INFO] Fetching updates for golang.org/x/net.
[INFO] Fetching updates for golang.org/x/oauth2.
[INFO] Fetching updates for golang.org/x/sys.
[INFO] Fetching updates for google.golang.org/api.
[INFO] Fetching updates for google.golang.org/appengine.
[INFO] Fetching updates for google.golang.org/cloud.
[INFO] Fetching updates for google.golang.org/grpc.
[INFO] Fetching updates for gopkg.in/yaml.v2.
[INFO] Fetching updates for k8s.io/kubernetes.
[WARN] Unable to checkout google.golang.org/api
[ERROR] Update failed for google.golang.org/api: Cloning into 'XXX/go/src/k8s.io/kube-deploy/upup/vendor/google.golang.org/api'...
error: RPC failed; HTTP 502 curl 22 The requested URL returned error: 502 Bad Gateway
fatal: The remote end hung up unexpectedly
: exit status 128
[INFO] Downloading dependencies. Please wait...
[INFO] Setting references.
[INFO] Setting version for golang.org/x/crypto to 77f4136a99ffb5ecdbdd0226bd5cb146cf56bc0e.
[INFO] Setting version for github.com/spf13/jwalterweatherman to 33c24e77fb80341fe7130ee7c594256ff08ccc46.
[INFO] Setting version for github.com/inconshreveable/mousetrap to 76626ae9c91c4f2a10f34cad8ce83ea42c93bb75.
[INFO] Setting version for github.com/BurntSushi/toml to f0aeabca5a127c4078abb8c8d64298b147264b55.
...
[INFO] Setting version for github.com/spf13/pflag to cb88ea77998c3f024757528e3305022ab50b43be.
[INFO] Setting version for github.com/magiconair/properties to c265cfa48dda6474e208715ca93e987829f572f8.
[ERROR] Failed to set version on google.golang.org/api to 63ade871fd3aec1225809d496e81ec91ab76ea29: open XXX/go/src/k8s.io/kube-deploy/upup/vendor/google.golang.org/api: no such file or directory
[INFO] Setting version for github.com/spf13/cobra to 1238ba19d24b0b9ceee2094e1cb31947d45c3e86.
...
An Error has occurred
make: *** [godeps] Error 2

Retrying did work.
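
Since the failure was a transient 502 from the upstream host and a retry succeeded, a crude retry loop around the same command (a sketch, not something in the current Makefile) would paper over it:

# Retry glide a few times to ride out transient 502s from the git hosts
for i in 1 2 3; do
  glide install --strip-vendor --strip-vcs && break
  echo "glide install failed (attempt $i), retrying..." >&2
  sleep 10
done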

upup: create DNS hosted zone in 'proto' phase?

This would likely make for a better terraform experience, because the lifespan of a DNS zone is likely longer than that of a particular cluster (not least because users need to reconfigure their DNS hosts).

Or we could maybe mark it as ignored by terraform.

upup: encrypt secrets

Support encryption for keystore keys and secretstore secrets.

Unclear whether this should be at the VFS level or not.

upup: We should figure out what to do with instance storage / root disks / btrfs

We've had a number of problems with ephemeral storage on EC2, not least that newer instance types don't include them (e.g. kubernetes/kubernetes#23787). Also symlinking /mnt/ephemeral seems to confuse the garbage collector.

We should figure out how to ensure that we have a big enough root disk, maybe how to re-enable btrfs, and then whether there is anything we can do with the instance storage if we're otherwise not going to use it (maybe hostVolumes? Or some sort of caching service?).

upup: terraform issue #2143 prevents creation of EC2 tags with dots

Terraform issue 2143 prevents creation of EC2 tags with dots, which we use for k8s.io/... It actually works for autoscaling groups; the only place it causes us trouble is with volumes.

Going to investigate a workaround where we create the volumes outside of terraform anyway, as you might not want terraform blindly deleting the crucial state of your cluster!
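
The manual half of that workaround is straightforward with the AWS CLI; the volume parameters and the dotted tag key below are placeholders:

# Create the volume outside terraform...
VOLUME_ID=$(aws ec2 create-volume --availability-zone eu-west-1a --size 20 --volume-type gp2 \
  --query VolumeId --output text)

# ...so we can attach the dotted tags that terraform issue 2143 chokes on
aws ec2 create-tags --resources "$VOLUME_ID" \
  --tags Key=k8s.io/role/master,Value=1 Key=KubernetesCluster,Value=mycluster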

upup: Secret store / CA store not thread safe

I think I saw an error which suggested that the secret store served invalid JSON for a secret. This would happen if we were mid-write during a read, I believe. We should write files atomically (anyway), and we should probably do some locking, particularly during create-if-not-exists operations.
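
For the local-filesystem backend, the usual atomic-write pattern is to write to a temporary file in the same directory and rename it into place; a shell sketch of the idea (the real fix would live in the Go store code, and SECRET_PATH is a placeholder):

# rename() within a filesystem is atomic, so readers see either the old or the new
# secret in full, never a torn write
tmp=$(mktemp "${SECRET_PATH}.XXXXXX")
cat > "$tmp"              # write the new JSON payload
mv -f "$tmp" "$SECRET_PATH"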

upup: support running etcd and apiserver on independent hosts

Hello,

Being able to run the etcd cluster independently from the apiservers would allow:

  • managing the etcd cluster separately
  • performance isolation / scaling the API layer independently
  • integrating with existing etcd setups

All other servers would then run etcd in proxy mode.
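
For reference, etcd2's proxy mode only needs to be told where the real cluster members live. A sketch of what the non-etcd hosts would run; the member names and addresses are placeholders:

# Run a local proxy that forwards client requests to the dedicated etcd cluster
etcd --proxy on \
  --listen-client-urls http://127.0.0.1:2379 \
  --initial-cluster etcd-a=http://10.0.0.10:2380,etcd-b=http://10.0.0.11:2380,etcd-c=http://10.0.0.12:2380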

Contribute kube-aws tool to kube-deploy repository

Kube-aws is what we use at CoreOS to deploy production Kubernetes clusters on AWS. It offers a number of features that make it well-suited for a production environment:

  • reproducible, version-controllable deployment artifacts
  • end-to-end encryption of TLS assets
  • compresses and inlines all build artifacts; no reliance on hosting artifacts in public S3 buckets
  • the entire deployment is a CloudFormation stack and integrates well with new CloudFormation update features
    • building on this, we're working towards in-place cluster upgrades

We'd like to start the discussion on moving development of this tool over to the kube-deploy repository.

cc @bgrant0607 @philips @justinsb

protokube: synchronize changes automatically from statestore

upup includes a state store, currently S3 backed, but we can easily add GCS. We could put e.g. addons into it, so that we can dynamically reconfigure them without any SSH tricks.

This could also be used to dynamically populate the initial manifests (e.g. apiserver, kcm, scheduler etc). Maybe we could also include kubelet, in which case nodeup becomes even smaller.

upup: race condition with kube-addons

Sometimes, typically when something else goes wrong, kube-addons will get stuck in a loop:

Jun 10 19:31:59 ip-172-20-22-48 kube-addons.sh[718]: Error from server: serviceaccounts "default" not found
Jun 10 19:32:00 ip-172-20-22-48 kube-addons.sh[718]: Error from server: serviceaccounts "default" not found
Jun 10 19:32:00 ip-172-20-22-48 kube-addons.sh[718]: Error from server: serviceaccounts "default" not found

This blocks cluster bring-up entirely. KCM won't allocate PodCIDRs etc.

Restarting kube-addons causes it to recover and the cluster starts normally.
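
The journal lines above suggest kube-addons runs as a systemd unit, so the manual recovery is presumably just (unit name is an assumption):

# Kick the addon loop so it re-reads the now-existing default serviceaccount
systemctl restart kube-addons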

sharing infrastructure between deployments

There is a certain amount of tooling that will benefit all deployment automations. We should discuss how to develop these tools in a way that benefits as many of the maintained deployments as possible. Some work that I can think of that would benefit all deployments:

  • Config revamp. It's hard to configure Kubernetes components.
  • Own the installation. Get kubelet and its dependencies installed.
  • Standardize on a node container image #34
  • (or) build package-manager packages (rpms and debs)
  • commit to a single binary
  • Make the pod network easy to deploy. It's hard to set up the pod network.
    • Allow CNI node agents to run in DaemonSets on the host and configure the kubelet
  • Revamp the addon manager. There is no standard way to deploy addons.
    • Move addons out of the kubernetes core repo
    • Get rid of the bash addon manager, possibly replace it with helm
    • Implement kubectl apply --prune kubernetes/kubernetes#19805
  • Initial client bootstrap. It's hard to set up secure communication between k8s components.
  • Self hosting. Make it easy to run Kubernetes on Kubernetes.
  • Kubeconfig v2

Obviously, some of these have higher relative priority than others. Let's use this issue to track the deployment shared-infrastructure effort. Let me know if there are items missing from this list.

cc @luxas @errordeveloper @justinsb

upup: validate & document upgrade procedure

We have an early upgrade procedure. We should validate it for 1.3, including:

  • pods with mounted volumes
  • services with load balancers

And then document the final procedure.

upup: fix tagging edge cases

There are some edge cases where items in AWS won't be tagged, primarily when there is no change other than tags.

We should split out tagging, and also check whether we can somehow avoid failures when we exit between resource creation and tagging.
