nwcdheap / kops-cn
AWS China (Ningxia / Beijing) regions: quickly deploy a K8S cluster with Kops
License: Apache License 2.0
I ran the install on my Mac.
After I had configured env.config and run create-cluster.sh, I got the error message below:
I0130 10:08:37.058539 5466 create_cluster.go:1407] Using SSH public key: /Users/wangqi/.ssh/id_rsa
I0130 10:08:38.840289 5466 subnets.go:184] Assigned CIDR 172.0.32.0/19 to subnet cn-northwest-1a
I0130 10:08:38.840325 5466 subnets.go:184] Assigned CIDR 172.0.64.0/19 to subnet cn-northwest-1b
I0130 10:08:38.840362 5466 subnets.go:184] Assigned CIDR 172.0.96.0/19 to subnet cn-northwest-1c
error determining default DNS zone: error querying zones: RequestError: send request failed
caused by: Get https://route53.cn-northwest-1.amazonaws.com.cn/2013-04-01/hostedzone: dial tcp: lookup route53.cn-northwest-1.amazonaws.com.cn on 172.20.53.163:53: no such host
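A likely cause (my reading, not confirmed in this thread): Route 53 has no regional endpoint in cn-northwest-1, so kops fails while looking up a hosted DNS zone for the cluster name. Using a gossip-based cluster name ending in .k8s.local skips the Route 53 lookup entirely, e.g.:

$ kops create cluster \
    --name=cluster.zhy.k8s.local \
    --cloud=aws \
    ...
# the .k8s.local suffix enables gossip DNS, so no Route 53 hosted zone is required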
As AWS ALB Ingress is now v1.0.0 and its ip target mode requires the AWS VPC CNI as the networking plugin, we should use it as the default networking mode:
--networking amazon-vpc-routed-eni
The new creation script would look like this:
kops create cluster \
  --cloud=aws \
  --name=$cluster_name \
  --image=$ami \
  --zones=$zones \
  --master-count=$master_count \
  --master-size=$master_size \
  --node-count=$node_count \
  --node-size=$node_size \
  --vpc=$vpcid \
  --networking amazon-vpc-routed-eni \
  --kubernetes-version="$kubernetesVersion" \
  --ssh-public-key=$ssh_public_key
After the cluster is created, when I use
"ssh -i ~/.ssh/id_rsa.pub [email protected]"
to log in to the master or a node, it asks for a password. What's the password?
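There is no password to enter; key-based auth is failing because -i points at the public key. A likely fix, assuming the default key pair paths, is to pass the private key instead:

$ ssh -i ~/.ssh/id_rsa admin@<master-or-node-ip>

The login user depends on the AMI (admin for the Debian-based kope.io images, ec2-user for Amazon Linux 2).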
When trying to use 'helm ls', I got this error:
"Error: incompatible versions client[v2.11.0] server[v2.9.1]"
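With Helm v2 this usually means the local client is newer than the in-cluster Tiller. One way to bring Tiller up to the client version (a standard Helm v2 command, not specific to this repo):

$ helm init --upgrade   # upgrades the Tiller deployment in kube-system to match the client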
https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
  - range: ">=1.11.0"
    recommendedVersion: 1.11.8
1.11.8
1.11.9
required-images.txt and display-remote-repos.sh are in the ./mirror sub-directory.

$ cd ./mirror
$ bash display-remote-repos.sh
You'll immediately get the full ECR repo paths:
$ bash display-remote-repos.sh
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/kope-dns-controller:1.11.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/k8s-dns-dnsmasq-nanny-amd64:1.14.10
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/k8s-dns-sidecar-amd64:1.14.10
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/k8s-dns-kube-dns-amd64:1.14.10
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/cluster-proportional-autoscaler-amd64:1.1.2-r2
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/coredns:1.1.3
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/coredns:1.2.6
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/etcd:2.2.1
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/pause-amd64:3.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/kube-controller-manager:v1.11.6
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/kube-scheduler:v1.11.6
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/kube-proxy:v1.11.6
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/kube-apiserver:v1.11.6
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io-heptio-images-authenticator:v0.3.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/602401143452.dkr.ecr.us-west-2.amazonaws.com-amazon-k8s-cni:v1.3.2
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-coreos-flannel:v0.10.0-amd64
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/ottoyiu-k8s-ec2-srcdst:v0.2.0-3-gc0c26eca
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/ottoyiu-k8s-ec2-srcdst:v0.2.2
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/894847497797.dkr.ecr.us-west-2.amazonaws.com-aws-alb-ingress-controller:v1.1.1
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-node:v3.4.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-cni:v3.4.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-node:v2.6.12
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-cni:v1.11.8
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-kube-controllers:v1.0.5
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-kube-policy-controller:v0.7.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-calico-calico-upgrade:v1.0.5
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/defaultbackend:1.4
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-kubernetes-ingress-controller-nginx-ingress-controller:0.20.0
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/etcd:3.2.18
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/etcd:3.2.24
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/kubernetes-dashboard-amd64:v1.10.1
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/proxy_init:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/citadel:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/proxyv2:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/galley:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/pilot:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/mixer:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/kubectl:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/sidecar_injector:master-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/proxy_init:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/citadel:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/proxyv2:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/galley:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/pilot:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/mixer:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/kubectl:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/sidecar_injector:release-1.0-latest-daily
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io-kubernetes-helm-tiller:v2.12.3
We got the upgrade from the Kops upstream: Kops 1.11.1 and K8s 1.11.7.
https://github.com/kubernetes/kops/releases/tag/1.11.1
https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
  - range: ">=1.11.0"
    recommendedVersion: 1.11.7
    requiredVersion: 1.11.0
required-images.txt, 70f5920: before running docker push, we can check whether the image already exists in ECR, e.g.:
# query ECR: does the image with digest $2 already carry tag $tag?
res=$(aws --profile bjs --region $ECR_REGION ecr describe-images --repository-name "$repo" \
  --query "imageDetails[?(@.imageDigest=='$2')].contains(@.imageTags, '$tag') | [0]")
if [ "$res" == "true" ]; then
  return 0
else
  return 1
fi
If this returns 0, we don't have to push the image to ECR, as it already exists.
The CNI version will roll back to 1.3.0 after upgrading to 1.3.3 or 1.4.0
Because the CA certificate is in a different location on CentOS, the k8s-ec2-srcdst container cannot start: it cannot find the certificate. This is mentioned in kubernetes/kops#4331.
Default certificate path: /etc/ssl/certs/ca-certificates.crt
Centos path: /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt
One workaround is to run the 'kubectl patch' command mentioned in the issue. Another thought I have is to change the kops source code and recompile it.
Any other good advice is welcome, thank you.
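A minimal sketch of such a patch (the deployment name and namespace are assumptions; check your cluster and the kops#4331 thread before applying). It bind-mounts the CentOS CA bundle onto the path the container expects:

$ kubectl -n kube-system patch deployment k8s-ec2-srcdst --patch '
spec:
  template:
    spec:
      containers:
      - name: k8s-ec2-srcdst
        volumeMounts:
        - name: ca-certs
          mountPath: /etc/ssl/certs/ca-certificates.crt
          readOnly: true
      volumes:
      - name: ca-certs
        hostPath:
          path: /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt'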
1.10.11 to China
The cluster can't be created if the kops client has a poor internet connection to GitHub. You can test the connectivity with:
curl https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
Use the -v flag in kops create for verbose logging:
kops create cluster \
  -v 9 \
  --cloud=aws \
  ...
create-cluster.sh
9801a7a9620b:kops-cn hunhsieh $ bash create-cluster.sh
I0129 01:36:33.093275 18727 create_cluster.go:1407] Using SSH public key: /Users/hunhsieh/.ssh/id_rsa.pub
I0129 01:36:33.093795 18727 factory.go:68] state store s3://pahud-kops-state-store-zhy
I0129 01:36:33.342730 18727 s3context.go:194] found bucket in region "cn-northwest-1"
I0129 01:36:33.342805 18727 s3fs.go:220] Reading file "s3://pahud-kops-state-store-zhy/cluster.zhy.k8s.local/config"
I0129 01:36:33.487717 18727 channel.go:97] resolving "stable" against default channel location "https://raw.githubusercontent.com/kubernetes/kops/master/channels/"
I0129 01:36:33.487769 18727 channel.go:102] Loading channel from "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable"
I0129 01:36:33.489467 18727 context.go:159] Performing HTTP request: GET https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
I0129 01:37:03.492838 18727 context.go:227] retrying after error error fetching "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": Get https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable: dial tcp 151.101.196.133:443: i/o timeout
I0129 01:37:03.993923 18727 context.go:159] Performing HTTP request: GET https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
I0129 01:37:33.997853 18727 context.go:227] retrying after error error fetching "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": Get https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable: dial tcp 151.101.196.133:443: i/o timeout
I0129 01:37:35.000502 18727 context.go:159] Performing HTTP request: GET https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
I0129 01:38:05.001673 18727 context.go:227] retrying after error error fetching "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": Get https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable: dial tcp 151.101.196.133:443: i/o timeout
I0129 01:38:07.002652 18727 context.go:159] Performing HTTP request: GET https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
I0129 01:38:37.009644 18727 context.go:227] retrying after error error fetching "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": Get https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable: dial tcp 151.101.196.133:443: i/o timeout
I0129 01:38:41.010692 18727 context.go:159] Performing HTTP request: GET https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
I0129 01:39:11.012203 18727 context.go:231] hit maximum retries 5 with error error fetching "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": Get https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable: dial tcp 151.101.196.133:443: i/o timeout
error reading channel "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": error fetching "https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable": Get https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable: dial tcp 151.101.196.133:443: i/o timeout
It may not be the right place to ask this, but I just wanted to check: is there a fast way to pull an ECR image from another AWS account outside China into AWS China?
According to this:
you will be able to pull the image from
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/quay.io-coreos-alb-ingress-controller:1.0-beta.7
Please note the version tag may change over time.
After upgrading to 1.11.6, the kube-proxy pod failed to come up, since this image is mirrored but the mirror doesn't have the 1.11.6 tag yet. (I guess it is the same for many other 1.11.6 images.)
I wonder if we can have an automatic script that enumerates all versions of the whitelisted images and mirrors them, so that we don't need to worry about this any more.
Currently, the default AMI causes the machines to keep rebooting. The official docs say the China regions can obtain the latest AMI by copying it: kope.io/k8s-1.11-debian-stretch-amd64-hvm-ebs-2018-08-17
Links:
https://github.com/kubernetes/kops/blob/master/docs/aws-china.md
kubernetes-retired/kube-aws#390 (comment)
We only have 1.0.0 now. According to the release notes, we may need multiple versions, including 1.3.0, 1.2.1, 1.2.0, etc.
dns-controller fails to start: kope/dns-controller:1.11.0 cannot be found
After the cluster is created, all the related services are created successfully, but when I run "kops validate cluster", it can't connect to the ELB. The log is below:
unexpected error during validation: error listing nodes: Get https://api-cluster-bjs-k8s-local-c9l1qd-2011066806.cn-north-1.elb.amazonaws.com.cn/api/v1/nodes: dial tcp 54.222.209.4:443: i/o timeout
Does anyone know the reason?
Thanks
If you try to deploy aws-alb-ingress-controller in the BJS & ZHY regions, it needs the ACM service, but ACM is not supported in ZHY & BJS yet.
The issue in the aws-alb-ingress-controller project is right here:
kubernetes-sigs/aws-load-balancer-controller#439
Add a sample for the extra instance group with mixed instance types and purchase options;
refer to kubernetes/kops#6215.
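A sketch of what such an instance group spec might look like (instance types, counts, and the group name are placeholders; the mixedInstancesPolicy field assumes a kops version that has merged kubernetes/kops#6215):

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: mixed-nodes
spec:
  role: Node
  minSize: 2
  maxSize: 5
  machineType: m5.large
  subnets:
  - cn-northwest-1a
  mixedInstancesPolicy:
    instances:              # candidate instance types, in priority order
    - m5.large
    - m4.large
    - c5.large
    onDemandBase: 1         # at least one On-Demand instance
    onDemandAboveBase: 50   # 50% On-Demand above the base, the rest Spot
    spotAllocationStrategy: lowest-price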
According to the Kops etcd roadmap document
https://github.com/kubernetes/kops/blob/af4df08b694e2a1f8814a7b3649060477be67c86/docs/etcd/roadmap.md
etcd3 will eventually become the default cluster version; however, for some reason it still sticks to v2.2 at this moment.
We have a PR trying to get it sorted, but for now, to align with the Kops upstream, we still stick to v2.2.
If you prefer to provision etcd3 as the default, you can update the spec.yml like this:
https://github.com/nwcdlabs/kops-cn/pull/29/files#diff-ce22796966d5547919fe1967f7781563
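For illustration, the relevant part of the cluster spec would look roughly like this (instance group names are placeholders; see the PR diff above for the exact change; 3.2.24 is one of the etcd versions already in the mirror list):

etcdClusters:
- name: main
  version: 3.2.24
  etcdMembers:
  - name: a
    instanceGroup: master-cn-northwest-1a
- name: events
  version: 3.2.24
  etcdMembers:
  - name: a
    instanceGroup: master-cn-northwest-1a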
Hi @jansony1 , feel free to update this issue if you have any other useful insights.
Thanks.
https://github.com/kubernetes/kops/releases/tag/1.10.1
https://raw.githubusercontent.com/kubernetes/kops/master/channels/stable
Let's simplify the control with a single Makefile and some entrypoints like:
make create-cluster
make edit-ig-nodes
make update-cluster
make delete-cluster
In the official Helm document, helm repo add and helm dep update are required before helm install.
Add istio.io chart repository and point to the daily release:
$ helm repo add istio.io https://storage.googleapis.com/istio-prerelease/daily-build/master-latest-daily/charts
Build the Helm dependencies:
$ helm dep update install/kubernetes/helm/istio
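After the dependency update, the install itself would be something like the following (release and namespace names follow the Istio 1.0-era docs; adjust to taste):

$ helm install install/kubernetes/helm/istio --name istio --namespace istio-system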
Got ImagePullBackOff errors while installing Istio:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22m default-scheduler Successfully assigned istio-citadel-5768b899d4-jg226 to ip-172-31-60-71.cn-north-1.compute.internal
Normal SuccessfulMountVolume 22m kubelet, ip-172-31-60-71.cn-north-1.compute.internal MountVolume.SetUp succeeded for volume "istio-citadel-service-account-token-skxg9"
Warning Failed 22m kubelet, ip-172-31-60-71.cn-north-1.compute.internal Failed to pull image "937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/citadel:release-1.0-latest-daily": rpc error: code = Unknown desc = Error response from daemon: manifest for 937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/citadel:release-1.0-latest-daily not found
Warning Failed 22m kubelet, ip-172-31-60-71.cn-north-1.compute.internal Error: ErrImagePull
Normal BackOff 22m (x2 over 22m) kubelet, ip-172-31-60-71.cn-north-1.compute.internal Back-off pulling image "937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/citadel:release-1.0-latest-daily"
Warning Failed 22m (x2 over 22m) kubelet, ip-172-31-60-71.cn-north-1.compute.internal Error: ImagePullBackOff
Normal Pulling 22m (x2 over 22m) kubelet, ip-172-31-60-71.cn-north-1.compute.internal pulling image "937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/gcr.io/istio-release/citadel:release-1.0-latest-daily"
Can't pull 937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/heptio-images-authenticator:v0.3.0
when running aws-iam-authenticator.
Containers:
aws-iam-authenticator:
Container ID:
Image: 937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/heptio-images-authenticator:v0.3.0
Image ID:
Port: <none>
Host Port: <none>
Args:
server
--config=/etc/aws-iam-authenticator/config.yaml
--state-dir=/var/aws-iam-authenticator
--generate-kubeconfig=/etc/kubernetes/aws-iam-authenticator/kubeconfig.yaml
I have seen the image in https://github.com/nwcdlabs/kops-cn/blob/master/mirror/required-images.txt#L32.
For the CN mainland scenario without ICP recordal, could we have an internal ELB to use with kubectl internally?
We need to use a customized nodeup for our cluster. When I try to override the default URL with
export NODEUP_URL='https://s3-us-west-2.amazonaws.com/my-bucket/nodeup/linux/amd64/01/23/18/1516747024/nodeup'
it seems to be hijacked by the fileRepository setting:
I1214 16:28:21.474161 94429 builder.go:297] error reading hash file "https://s3.cn-north-1.amazonaws.com.cn/kops-bjs/fileRepository/my-bucket/nodeup/linux/amd64/01/23/18/1516747024/nodeup.sha1": file does not exist
you may have not staged your files correctly, please execute kops update cluster using the assets phase
Any suggestions?
I ran into the following issues when creating the cluster:
I0416 09:48:45.708089 3672 create_cluster.go:1407] Using SSH public key: /home/ec2-user/.ssh/id_rsa.pub
error reading cluster configuration "cluster.zhy.k8s.local": error reading s3://liuhongxi-kops-cn/cluster.zhy.k8s.local/config: Unable to list AWS regions: NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, default.
EC2RoleRequestError: no EC2 instance role found
caused by: EC2MetadataError: failed to make EC2Metadata request
caused by:
make: *** [create-cluster] Error 1
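The error chain means the AWS SDK found no credentials anywhere (environment, shared credentials file, or instance role). One quick way to verify your setup before re-running make (assuming a configured "cn" profile; both commands are standard AWS CLI):

$ export AWS_PROFILE=cn
$ aws sts get-caller-identity   # should print your China-partition account ID and ARN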
Symptom
Deploying kops with the Amazon VPC CNI (--networking=amazon-vpc-routed-eni), the aws-node daemonset fails due to ImagePullBackOff.
Root Cause
The generated image URL of aws-node is invalid:
937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/602401143452.dkr.ecr.us-west-2.amazonaws.com-amazon-k8s-cni:1.0.0
From the YAML template, the image URL is generated from the parameter "Networking.AmazonVPC.ImageName", or defaults to the us-west-2 ECR image URL.
It works well after changing the aws-node image URL to "pahud/amazon-k8s-cni:1.0.0".
Suggested Solution
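One possible approach, sketched below: pin the image through the cluster-spec field behind "Networking.AmazonVPC.ImageName". The mirrored tag shown is taken from required-images.txt and is not necessarily the fix the author intended:

networking:
  amazonvpc:
    imageName: 937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/602401143452.dkr.ecr.us-west-2.amazonaws.com-amazon-k8s-cni:v1.3.2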
I am using the private topology and an internal ELB to create a K8S cluster:

kops create cluster \
  ...   # other parameters omitted
  --topology=private \
  --networking=amazon-vpc-routed-eni \
  --api-loadbalancer-type=internal

After the cluster started, each node was assigned two interfaces. However, I found only one node in service behind the ELB.
From the working node, the ip route table is:
core@ip-172-20-53-190 ~ $ ip route
default via 172.20.32.1 dev eth0 proto dhcp src 172.20.53.190 metric 1024
default via 172.20.32.1 dev eth1 proto dhcp src 172.20.61.156 metric 1024
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.20.32.0/19 dev eth0 proto kernel scope link src 172.20.53.190
172.20.32.1 dev eth0 proto dhcp scope link src 172.20.53.190 metric 1024
172.20.32.1 dev eth1 proto dhcp scope link src 172.20.61.156 metric 1024
You can see that the eth0 entry is on top. That explains why you can reach the eth0 IP.
From the other two defunct nodes, the ip route is like:
core@ip-172-20-114-248 ~ $ ip route
default via 172.20.96.1 dev eth1 proto dhcp src 172.20.98.243 metric 1024
default via 172.20.96.1 dev eth0 proto dhcp src 172.20.114.248 metric 1024
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.20.96.0/19 dev eth1 proto kernel scope link src 172.20.98.243
172.20.96.0/19 dev eth0 proto kernel scope link src 172.20.114.248
172.20.96.1 dev eth1 proto dhcp scope link src 172.20.98.243 metric 1024
172.20.96.1 dev eth0 proto dhcp scope link src 172.20.114.248 metric 1024
core@ip-172-20-114-248 ~ $
The eth1 entry is on top, thus you can only reach the IP of eth1.
I am not sure if there is something wrong with my creation. I hope someone can help. Thank you.
--Beta
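To diagnose which interface and source address the kernel picks for a given destination, this standard iproute2 query may help (the address is a placeholder; use your ELB's IP):

$ ip route get 172.20.32.1   # prints the chosen device and src address for that destination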
This sample demonstrates how to create two kops clusters inside the same VPC in the Ningxia region, each with its own cluster_name:
1st cluster: cluster1.zhy.k8s.local
2nd cluster: cluster2.zhy.k8s.local
(Note: the names must end with k8s.local)
Prepare six subnets in one VPC as below; in this sample each cluster will use three of them.
Prepare two Makefiles, cluster1.mk and cluster2.mk. Sample contents:
https://github.com/nwcdlabs/kops-cn/blob/master/samples/multi-clusters-in-shared-vpc/cluster1.mk
https://github.com/nwcdlabs/kops-cn/blob/master/samples/multi-clusters-in-shared-vpc/cluster2.mk
create cluster:
AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster1.zhy.k8s.local \
make -f cluster1.mk create-cluster
edit cluster:
AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster1.zhy.k8s.local \
make -f cluster1.mk edit-cluster
(paste in the contents of spec.yml)
update cluster:
AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster1.zhy.k8s.local \
make -f cluster1.mk update-cluster
switch the context to cluster1:
$ kubectl config use-context cluster1.zhy.k8s.local
validate cluster1:
$ AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster1.zhy.k8s.local \
make -f cluster1.mk validate-cluster
create cluster:
AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster2.zhy.k8s.local \
make -f cluster2.mk create-cluster
edit cluster:
AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster2.zhy.k8s.local \
make -f cluster2.mk edit-cluster
(paste in the contents of spec.yml)
update cluster:
AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster2.zhy.k8s.local \
make -f cluster2.mk update-cluster
switch the context to cluster2:
$ kubectl config use-context cluster2.zhy.k8s.local
validate cluster2:
$ AWS_PROFILE=cn CUSTOM_CLUSTER_NAME=cluster2.zhy.k8s.local \
make -f cluster2.mk validate-cluster
Both clusters can list all the Pods in kube-system, and all of them are Running.
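For example, to check each cluster's system pods (standard kubectl; the --context flag selects a cluster without switching the current context):

$ kubectl --context cluster1.zhy.k8s.local -n kube-system get pods
$ kubectl --context cluster2.zhy.k8s.local -n kube-system get pods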
In kops edit cluster I tried to add:

assets:
  containerRegistry: 937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn
  fileRepository: https://s3.cn-north-1.amazonaws.com.cn/kops-bjs/fileRepository/
docker:
  logDriver: ""
  registryMirrors:
  - https://registry.docker-cn.com

then :wq to save; however, the editor re-opens with this error:
error populating cluster spec: error building complete spec: options did not converge after 10 iterations
I tried to ignore it and keep going to the final step, kops validate cluster:
unexpected error during validation: error listing nodes: Get https://api-cluster-zhy-k8s-local-qpbf7n-1465482247.cn-northwest-1.elb.amazonaws.com.cn/api/v1/nodes: EOF
Going to the AWS console, I find the three master instances out of service.
I checked the security group rules and they are all right.
I have already made an ICP exception for 80/8080/443, and I can telnet to the ELB on 443.
I also googled and found a similar issue: kubernetes/kops#5061
How do I fix this?
Customers need to deploy their cluster into existing subnets, but the Kops official page is not clear, as you can see here:
https://github.com/kubernetes/kops/blob/master/docs/run_in_existing_vpc.md#shared-subnets
specifically these parts:
export SUBNET_ID=subnet-12345678 # replace with your subnet id
export SUBNET_CIDR=10.100.0.0/24 # replace with your subnet CIDR
export SUBNET_IDS=$SUBNET_IDS # replace with your comma separated subnet ids
What you really need is to modify the create-cluster target as below, according to your subnets and zones, and keep the rest of the Makefile unchanged (PS: only the necessary parts are included):
.PHONY: create-cluster
create-cluster:
	@KOPS_STATE_STORE=$(KOPS_STATE_STORE) \
	AWS_PROFILE=$(AWS_PROFILE) \
	AWS_REGION=$(AWS_REGION) \
	AWS_DEFAULT_REGION=$(AWS_DEFAULT_REGION) \
	kops create cluster \
	--cloud=aws \
	--name=$(CLUSTER_NAME) \
	--image=$(AMI) \
	--master-count=$(MASTER_COUNT) \
	--master-size=$(MASTER_SIZE) \
	--node-count=$(NODE_COUNT) \
	--node-size=$(NODE_SIZE) \
	--vpc=$(VPCID) \
	--kubernetes-version=$(KUBERNETES_VERSION_URI) \
	--networking=amazon-vpc-routed-eni \
	--ssh-public-key=$(SSH_PUBLIC_KEY) \
	--zones=cn-northwest-1a,cn-northwest-1b \
	--subnets=subnet-2cf25a45,subnet-9315d7e8
1. Delete the original zones option.
2. Then add the last two lines.
3. Your subnets' order must match your zones' order.
Amazon EKS now supports AWS VPC CNI 1.3
https://docs.aws.amazon.com/en_us/eks/latest/userguide/cni-upgrades.html
Let's make sure kops-cn works well with CNI 1.3.
Some of our customers may make a mistake in the "make edit-cluster" step, or they may need to update their cluster, so they need a rolling-update option. Also, if they choose to run their kops operations through the Makefile, we had better list all operations there so that they have no need to maintain another set of environment variables.
These are two of my customers' requirements, so I am adding them here.
.PHONY: rolling-cluster
rolling-cluster:
	@KOPS_STATE_STORE=$(KOPS_STATE_STORE) \
	AWS_PROFILE=$(AWS_PROFILE) \
	AWS_REGION=$(AWS_REGION) \
	AWS_DEFAULT_REGION=$(AWS_DEFAULT_REGION) \
	kops rolling-update cluster --name $(CLUSTER_NAME) --yes --cloudonly

.PHONY: get-cluster
get-cluster:
	@KOPS_STATE_STORE=$(KOPS_STATE_STORE) \
	AWS_PROFILE=$(AWS_PROFILE) \
	AWS_REGION=$(AWS_REGION) \
	AWS_DEFAULT_REGION=$(AWS_DEFAULT_REGION) \
	kops get cluster --name $(CLUSTER_NAME)
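Usage would then look like this (assuming the same variables the repo's main Makefile already sets):

$ AWS_PROFILE=cn make rolling-cluster
$ AWS_PROFILE=cn make get-cluster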
...to make sure the OS is more compatible with other components, and to eliminate potential maintenance complexity in the future.
Make sure cn-north-1 and cn-northwest-1 are compatible with the latest stable kops.
env.config: set Amazon Linux 2 LTS as the default AMI.
https://github.com/nwcdlabs/kops-cn/blob/master/doc/aws-alb-ingress_en.md
The new image URL should be: 937788672844.dkr.ecr.cn-north-1.amazonaws.com.cn/894847497797.dkr.ecr.us-west-2.amazonaws.com-aws-alb-ingress-controller:v1.0.1
Where do you store $ak_kms_cipherblob? I'm not quite familiar with CodeBuild.
When I create the cluster and then edit and update it following the instructions, with 3 masters and 1 node in cn-northwest-1, the node stays in NotReady status.
If I change the Makefile to set NETWORKING to flannel-vxlan, it is OK.
I guess it is because the node has multiple private IPs; if the primary IP is not the first one, the node goes NotReady.