cluster-api-provider-nested's Introduction

Kubernetes Cluster API Provider Nested

Cluster API Provider for Nested Clusters

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

You can reach the maintainers of this project at:

  • Slack
  • Mailing List
  • Join our Cluster API Provider Nested working group sessions
    • Weekly on Tuesdays @ 10:00 PT
    • Previous meetings: notes

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

cluster-api-provider-nested's People

Contributors

charleszheng44, christopherhein, cpanato, crazywill, dependabot[bot], evan-whitehouse, fenglixa, gyliu513, hanlins, jichenjc, jinsongo, jzhoucliqr, k8s-ci-robot, kaushik229, lubingtan, lukeweber, m-messiah, nikhita, rjsadow, srirammageswaran8, stmcginnis, vincent-pli, vincepri, weiling61, wondywang, ydp, yoonmac, yuanchen8911, zhouhao3, zhuangqh


cluster-api-provider-nested's Issues

[Quick Start] Failed to connect to cluster

I was following https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/docs/README.md to run a quick test for CAPN, but at the last step, connecting to the cluster failed. Can anyone share some insight into what is wrong?

@charleszheng44 ^^

Guangyas-MacBook-Pro:kubernetes-sigs guangyaliu$ kubectl --kubeconfig kubeconfig get all -A
Unable to connect to the server: net/http: TLS handshake timeout

I can see the nested control plane is running:

Guangyas-MacBook-Pro:kubernetes-sigs guangyaliu$ kubectl get pod
NAME                                  READY   STATUS    RESTARTS   AGE
cluster-sample-apiserver-0            1/1     Running   0          17m
cluster-sample-controller-manager-0   1/1     Running   2          17m
cluster-sample-etcd-0                 1/1     Running   0          17m

But the cluster status is Provisioning

Guangyas-MacBook-Pro:kubernetes-sigs guangyaliu$ kubectl get clusters
NAME             PHASE
cluster-sample   Provisioning
Guangyas-MacBook-Pro:kubernetes-sigs guangyaliu$ kubectl get clusters -oyaml
apiVersion: v1
items:
- apiVersion: cluster.x-k8s.io/v1alpha4
  kind: Cluster
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"cluster.x-k8s.io/v1alpha4","kind":"Cluster","metadata":{"annotations":{},"name":"cluster-sample","namespace":"default"},"spec":{"controlPlaneEndpoint":{"host":"cluster-sample-apiserver","port":6443},"controlPlaneRef":{"apiVersion":"controlplane.cluster.x-k8s.io/v1alpha4","kind":"NestedControlPlane","name":"nestedcontrolplane-sample","namespace":"default"},"infrastructureRef":{"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha4","kind":"NestedCluster","name":"nestedcluster-sample","namespace":"default"}}}
    creationTimestamp: "2021-05-22T14:41:16Z"
    finalizers:
    - cluster.cluster.x-k8s.io
    generation: 1
    managedFields:
    - apiVersion: cluster.x-k8s.io/v1alpha4
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:spec:
          .: {}
          f:controlPlaneEndpoint:
            .: {}
            f:host: {}
            f:port: {}
          f:controlPlaneRef:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:namespace: {}
          f:infrastructureRef:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:namespace: {}
      manager: kubectl-client-side-apply
      operation: Update
      time: "2021-05-22T14:41:16Z"
    - apiVersion: cluster.x-k8s.io/v1alpha4
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"cluster.cluster.x-k8s.io": {}
        f:status:
          .: {}
          f:conditions: {}
          f:controlPlaneReady: {}
          f:observedGeneration: {}
          f:phase: {}
      manager: manager
      operation: Update
      time: "2021-05-22T14:44:32Z"
    name: cluster-sample
    namespace: default
    resourceVersion: "4670"
    uid: c2884e62-87f0-4709-85c5-c6a97e85a631
  spec:
    controlPlaneEndpoint:
      host: cluster-sample-apiserver
      port: 6443
    controlPlaneRef:
      apiVersion: controlplane.cluster.x-k8s.io/v1alpha4
      kind: NestedControlPlane
      name: nestedcontrolplane-sample
      namespace: default
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
      kind: NestedCluster
      name: nestedcluster-sample
      namespace: default
  status:
    conditions:
    - lastTransitionTime: "2021-05-22T14:41:45Z"
      reason: WaitingForInfrastructure
      severity: Info
      status: "False"
      type: Ready
    - lastTransitionTime: "2021-05-22T14:41:45Z"
      status: "True"
      type: ControlPlaneInitialized
    - lastTransitionTime: "2021-05-22T14:41:45Z"
      status: "True"
      type: ControlPlaneReady
    - lastTransitionTime: "2021-05-22T14:41:17Z"
      reason: WaitingForInfrastructure
      severity: Info
      status: "False"
      type: InfrastructureReady
    controlPlaneReady: true
    observedGeneration: 1
    phase: Provisioning
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Test fails with race detected

How to reproduce?

Sync to ToT and run make

=== RUN   TestReconcile
==================
WARNING: DATA RACE
Write at 0x00c0001146a8 by goroutine 41:
  internal/race.Write()
      /usr/local/go/src/internal/race/race.go:41 +0x114
  sync.(*WaitGroup).Wait()
      /usr/local/go/src/sync/waitgroup.go:128 +0x115
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).waitForRunnableToEnd.func2()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:567 +0x40

Previous read at 0x00c0001146a8 by goroutine 173:
  internal/race.Read()
      /usr/local/go/src/internal/race/race.go:37 +0x1e8
  sync.(*WaitGroup).Add()
      /usr/local/go/src/sync/waitgroup.go:71 +0x1fb
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:678 +0x4e
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).serveMetrics()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:384 +0x318

Goroutine 41 (running) created at:
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).waitForRunnableToEnd()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:566 +0xc6
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:548 +0x370
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start.func1()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:449 +0x49
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:499 +0x573
  sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/vcmanager.(*VirtualClusterManager).Start()
      <autogenerated>:1 +0x7d
  sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/virtualcluster.StartTestManager.func1()
      /Users/f.guo/go/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/virtualcluster/virtualcluster_controller_suite_test.go:73 +0xb0

Goroutine 173 (running) created at:
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start()
      /Users/f.guo/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:473 +0x5d4
  sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/vcmanager.(*VirtualClusterManager).Start()
      <autogenerated>:1 +0x7d
  sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/virtualcluster.StartTestManager.func1()
      /Users/f.guo/go/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/virtualcluster/virtualcluster_controller_suite_test.go:73 +0xb0
==================
    TestReconcile: testing.go:906: race detected during execution of test
--- FAIL: TestReconcile (0.16s)
    : testing.go:906: race detected during execution of test
FAIL
coverage: 10.7% of statements
FAIL	sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/controller/virtualcluster	11.843s

It can be reproduced 100% of the time on my local machine.

Change flag names in `main.go`

To help align the flags in this provider with Kubernetes component standards, we should change the flag names in https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/master/main.go#L66-L79:

	fs.StringVar(&metricsAddr, "metrics-bind-address", ":8080",
		"The address the metric endpoint binds to.")

	fs.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.")

	fs.DurationVar(&leaderElectionLeaseDuration, "leader-elect-lease-duration", 15*time.Second,
		"Interval at which non-leader candidates will wait to force acquire leadership (duration string)")

	fs.DurationVar(&leaderElectionRenewDeadline, "leader-elect-renew-deadline", 10*time.Second,
		"Duration that the leading controller manager will retry refreshing leadership before giving up (duration string)")

	fs.DurationVar(&leaderElectionRetryPeriod, "leader-elect-retry-period", 2*time.Second,
		"Duration the LeaderElector clients should wait between tries of actions (duration string)")

🐛 Incorrect Manifest Path for Releases

Logs from the Prow release job:

Step #0: make set-manifest-image MANIFEST_IMG=gcr.io/k8s-staging-cluster-api-nested/cluster-api-nested-controller MANIFEST_TAG=v20210608-e593785 TARGET_RESOURCE="./config/manager/manager_image_patch.yaml"
Step #0: make[3]: Entering directory '/workspace'
Step #0: fatal: not a git repository (or any parent up to mount point /)
Step #0: Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Step #0: Updating kustomize image patch file for manager resource
Step #0: sed -i'' -e 's@image: .*@image: '"gcr.io/k8s-staging-cluster-api-nested/cluster-api-nested-controller:v20210608-e593785"'@' ./config/manager/manager_image_patch.yaml
Step #0: make[3]: Leaving directory '/workspace'
Step #0: make[2]: Leaving directory '/workspace'
Step #0: sed: ./config/manager/manager_image_patch.yaml: No such file or directory
Step #0: make[3]: *** [Makefile:300: set-manifest-image] Error 1
Step #0: make[2]: *** [Makefile:289: docker-push-core-manifest] Error 2
Step #0: make[1]: *** [Makefile:268: docker-push-all] Error 2
Step #0: make[1]: Leaving directory '/workspace'
Step #0: make: *** [Makefile:346: release-staging] Error 2
Finished Step #0
ERROR
ERROR: build step 0 "gcr.io/k8s-testimages/gcb-docker-gcloud:v20200619-68869a4" failed: step exited with non-zero status: 2

[Quick Start] Failed to connect to cluster

Guangyas-MacBook-Pro:kubernetes-sigs guangyaliu$ kubectl --kubeconfig kubeconfig get all -A
Unable to connect to the server: dial tcp: lookup cluster-sample-apiserver on 1.1.1.2:53: server misbehaving
Guangyas-MacBook-Pro:kubernetes-sigs guangyaliu$ kubectl get svc
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
cluster-sample-apiserver   ClusterIP   10.96.206.199   <none>        6443/TCP   5m7s
cluster-sample-etcd        ClusterIP   None            <none>        <none>     5m9s
kubernetes                 ClusterIP   10.96.0.1       <none>        443/TCP    24m

@charleszheng44 ^^

🌱 Add Release Prow Jobs

Prior to the first version being released, we need to add the proper release scripts to the scripts/ dir as well as set up Prow release jobs. For now we should have just CAPN covered; we'll eventually create separate jobs for virtualcluster releases.

TODOS:

High Level CAPN Goals

Prelim Goals:

  • To configure new resource types for nested pod based control planes
  • To enable declarative orchestrated control plane upgrades (minus etcd)
  • To enable independent scaling of each of the control plane components
  • To enable configuring the control plane components
  • To enable the use of custom implementations for all components

Prelim Non-Goals:

  • To manage etcd; it's expected that etcd can be run separately with an integration point
  • To provide CNI configuration; this is left to the management or super cluster

Enable support for different Kubernetes distributions

This is an issue migrated from kubernetes-retired/multi-tenancy#1479.

The virtualcluster currently supports only vanilla Kubernetes, but in practice many customers use different Kubernetes distributions. It would be great to enable virtualcluster to support OpenShift and other Kubernetes distributions that have passed the conformance tests at https://github.com/cncf/k8s-conformance/tree/master/v1.20.

What is the plan to enable support for multiple Kubernetes distributions? Is anyone working on this?

@Fei-Guo @christopherhein

High Level User Stories

  1. As a control plane operator, I want my Kubernetes control plane to have multiple replicas to meet my zero-downtime needs.
  2. As a control plane operator, I want to be able to declare how each control plane is configured.
  3. As a control plane operator, I want to be able to rotate the certificates on all components so that my cluster continues to run.
  4. As a developer, I want to be able to use control-plane-level resources without conflicting with another control plane.
  5. As a control plane operator, I want to be able to upgrade to a minor version so my cluster remains supported.
  6. As a control plane operator, I want to be able to know my cluster is working properly after it's been created.

Who will fill out the OwnerReference of the component CRs?

As discussed in PR #11, we all agree that there will be multiple CRs (NestedAPIServer, NestedEtcd, and NestedControllerManager). To associate them, we plan to set each component CR's metav1.OwnerReference to the owning NCP. But who will fill out the OwnerReference for the component CRs?

We could let the end user do it, but the metav1.OwnerReference is normally filled out by the controller/operator (as the OwnerReference contains fields like the object UID that are normally unknown in advance). I think a more conventional approach would be to group the CRs using a label and let the NCP controller fill out the metav1.OwnerReference for the component CRs.

For example, say we have an NCP CR named NCP1. For each of its component CRs, the end user would set metav1.Labels[ownerNCP] = NCP1; after the user applies them, the NCP controller would add the OwnerReference for them.
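
A rough sketch of what the controller side of this could look like, using unstructured objects so the example stays self-contained; the label key ownerNCP, the function name, and the use of the NestedAPIServer kind here are illustrative, not actual CAPN code:

package controllers

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// adoptComponents lists NestedAPIServer CRs labelled ownerNCP=<NCP name> in the
// NCP's namespace and sets the NCP as their owner, so the end user never needs
// to know the NCP's UID in advance.
func adoptComponents(ctx context.Context, c client.Client, ncp *unstructured.Unstructured) error {
	var components unstructured.UnstructuredList
	components.SetGroupVersionKind(schema.GroupVersionKind{
		Group:   "controlplane.cluster.x-k8s.io",
		Version: "v1alpha4",
		Kind:    "NestedAPIServerList",
	})
	if err := c.List(ctx, &components,
		client.InNamespace(ncp.GetNamespace()),
		client.MatchingLabels{"ownerNCP": ncp.GetName()}); err != nil {
		return err
	}

	controller := true
	for i := range components.Items {
		obj := &components.Items[i]
		// The controller, not the end user, fills out the OwnerReference,
		// including the UID that is only known after the NCP has been created.
		obj.SetOwnerReferences([]metav1.OwnerReference{{
			APIVersion: "controlplane.cluster.x-k8s.io/v1alpha4",
			Kind:       "NestedControlPlane",
			Name:       ncp.GetName(),
			UID:        ncp.GetUID(),
			Controller: &controller,
		}})
		if err := c.Update(ctx, obj); err != nil {
			return err
		}
	}
	return nil
}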

What do you think? @christopherhein @vincepri @Fei-Guo @brightzheng100

🐛 Post Submit Image Builds

All post-submit image builds are failing because the wrong GCP project is being specified. We need to update the Makefile to point to the correct registry URL with the project name.

Move images from Docker Hub to Quay

Currently, all of the images for VC are hosted on Docker Hub.

The following command will create a ClusterVersion named cv-sample-np, which specifies the tenant master components as:

  • etcd: a StatefulSet with virtualcluster/etcd-v3.4.0 image, 1 replica;
  • apiServer: a StatefulSet with virtualcluster/apiserver-v1.16.2 image, 1 replica;
  • controllerManager: a StatefulSet with virtualcluster/controller-manager-v1.16.2 image, 1 replica.

When testing with Kind, I often run into Docker rate limits and cannot pull the images.

Is it possible to move the images from Docker Hub to quay.io? Then we would not have this problem.

root@gyliu-dev21:~/.docker# kubectl get pods
NAME                                  READY   STATUS             RESTARTS   AGE
cluster-sample-apiserver-0            0/1     ImagePullBackOff   0          9m43s
cluster-sample-controller-manager-0   0/1     ImagePullBackOff   0          9m55s
cluster-sample-etcd-0                 1/1     Running            0          9m48s
root@gyliu-dev21:~/.docker# kubectl describe po cluster-sample-apiserver-0
Name:         cluster-sample-apiserver-0
Namespace:    default
Priority:     0
Node:         capn-control-plane/172.18.0.2
Start Time:   Sun, 23 May 2021 18:53:43 -0700
Labels:       component-name=nestedapiserver-sample
              controller-revision-hash=cluster-sample-apiserver-7bff79549
              statefulset.kubernetes.io/pod-name=cluster-sample-apiserver-0
Annotations:  <none>
Status:       Pending
IP:           10.244.0.12
IPs:
  IP:           10.244.0.12
Controlled By:  StatefulSet/cluster-sample-apiserver
Containers:
  nestedapiserver-sample:
    Container ID:
    Image:         virtualcluster/apiserver-v1.16.2
    Image ID:
    Port:          6443/TCP
    Host Port:     0/TCP
    Command:
      kube-apiserver
    Args:
      --bind-address=0.0.0.0
      --allow-privileged=true
      --anonymous-auth=true
      --client-ca-file=/etc/kubernetes/pki/apiserver/ca/tls.crt
      --tls-cert-file=/etc/kubernetes/pki/apiserver/tls.crt
      --tls-private-key-file=/etc/kubernetes/pki/apiserver/tls.key
      --kubelet-https=true
      --kubelet-certificate-authority=/etc/kubernetes/pki/apiserver/ca/tls.crt
      --kubelet-client-certificate=/etc/kubernetes/pki/kubelet/tls.crt
      --kubelet-client-key=/etc/kubernetes/pki/kubelet/tls.key
      --kubelet-preferred-address-types=InternalIP,ExternalIP
      --enable-bootstrap-token-auth=true
      --etcd-servers=https://cluster-sample-etcd-0.cluster-sample-etcd.$(NAMESPACE):2379
      --etcd-cafile=/etc/kubernetes/pki/etcd/ca/tls.crt
      --etcd-certfile=/etc/kubernetes/pki/etcd/tls.crt
      --etcd-keyfile=/etc/kubernetes/pki/etcd/tls.key
      --service-account-key-file=/etc/kubernetes/pki/service-account/tls.key
      --service-cluster-ip-range=10.32.0.0/16
      --service-node-port-range=30000-32767
      --authorization-mode=Node,RBAC
      --runtime-config=api/all
      --enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota
      --apiserver-count=1
      --endpoint-reconciler-type=master-count
      --v=2
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Liveness:       tcp-socket :6443 delay=15s timeout=15s period=10s #success=1 #failure=8
    Readiness:      http-get https://:6443/healthz delay=5s timeout=30s period=2s #success=1 #failure=8
    Environment:
      NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /etc/kubernetes/pki/apiserver from cluster-sample-apiserver-client (ro)
      /etc/kubernetes/pki/apiserver/ca from cluster-sample-ca (ro)
      /etc/kubernetes/pki/etcd from cluster-sample-etcd-client (ro)
      /etc/kubernetes/pki/etcd/ca from cluster-sample-etcd-ca (ro)
      /etc/kubernetes/pki/kubelet from cluster-sample-kubelet-client (ro)
      /etc/kubernetes/pki/service-account from cluster-sample-sa (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-fltrm (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  cluster-sample-apiserver-client:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-sample-apiserver-client
    Optional:    false
  cluster-sample-etcd-ca:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-sample-etcd
    Optional:    false
  cluster-sample-etcd-client:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-sample-etcd-client
    Optional:    false
  cluster-sample-kubelet-client:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-sample-kubelet-client
    Optional:    false
  cluster-sample-ca:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-sample-ca
    Optional:    false
  cluster-sample-sa:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-sample-sa
    Optional:    false
  default-token-fltrm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fltrm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                     From                         Message
  ----     ------     ----                    ----                         -------
  Normal   Scheduled  9m51s                   default-scheduler            Successfully assigned default/cluster-sample-apiserver-0 to capn-control-plane
  Normal   Pulling    8m (x4 over 9m50s)      kubelet, capn-control-plane  Pulling image "virtualcluster/apiserver-v1.16.2"
  Warning  Failed     7m55s (x4 over 9m44s)   kubelet, capn-control-plane  Failed to pull image "virtualcluster/apiserver-v1.16.2": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/virtualcluster/apiserver-v1.16.2:latest": failed to copy: httpReaderSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/virtualcluster/apiserver-v1.16.2/manifests/sha256:81fc8bb510b07535525413b725aed05765b56961c1f4ed28b92ba30acd65f6fb: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
  Warning  Failed     7m55s (x4 over 9m44s)   kubelet, capn-control-plane  Error: ErrImagePull
  Warning  Failed     7m41s (x6 over 9m43s)   kubelet, capn-control-plane  Error: ImagePullBackOff
  Normal   BackOff    4m38s (x19 over 9m43s)  kubelet, capn-control-plane  Back-off pulling image "virtualcluster/apiserver-v1.16.2"

@Fei-Guo ^^

🐛 Unable to create a VirtualCluster on k8s v1.20.2

Problem

The virtual cluster does not deploy with k8s v1.20.2. Output from vc-manager:

{"level":"info","ts":1621457612.1333222,"logger":"clusterversion-controller","msg":"reconciling ClusterVersion..."}
{"level":"info","ts":1621457612.1334903,"logger":"clusterversion-controller","msg":"new ClusterVersion event","ClusterVersionName":"cv-sample-np"}
{"level":"info","ts":1621457635.4177175,"logger":"virtualcluster-webhook","msg":"validate create","vc-name":"vc-sample-1"}
{"level":"info","ts":1621457635.4421399,"logger":"virtualcluster-controller","msg":"reconciling VirtualCluster..."}
{"level":"info","ts":1621457635.4824774,"logger":"virtualcluster-webhook","msg":"validate update","vc-name":"vc-sample-1"}
{"level":"info","ts":1621457635.511791,"logger":"virtualcluster-controller","msg":"a finalizer has been registered for the VirtualCluster CRD","finalizer":"virtualcluster.finalizer.native"}
{"level":"info","ts":1621457635.5118568,"logger":"virtualcluster-controller","msg":"will create a VirtualCluster","vc":"vc-sample-1"}
{"level":"info","ts":1621457635.53576,"logger":"virtualcluster-webhook","msg":"validate update","vc-name":"vc-sample-1"}
{"level":"info","ts":1621457635.556264,"logger":"virtualcluster-controller","msg":"reconciling VirtualCluster..."}
{"level":"info","ts":1621457635.5563915,"logger":"virtualcluster-controller","msg":"VirtualCluster is pending","vc":"vc-sample-1"}
{"level":"info","ts":1621457638.3632772,"logger":"virtualcluster-controller","msg":"creating secret","name":"root-ca","namespace":"default-a4a766-vc-sample-1"}
{"level":"info","ts":1621457638.400915,"logger":"virtualcluster-controller","msg":"creating secret","name":"apiserver-ca","namespace":"default-a4a766-vc-sample-1"}
{"level":"info","ts":1621457638.4276915,"logger":"virtualcluster-controller","msg":"creating secret","name":"etcd-ca","namespace":"default-a4a766-vc-sample-1"}
{"level":"info","ts":1621457638.4523375,"logger":"virtualcluster-controller","msg":"creating secret","name":"controller-manager-kubeconfig","namespace":"default-a4a766-vc-sample-1"}
{"level":"info","ts":1621457638.485505,"logger":"virtualcluster-controller","msg":"creating secret","name":"admin-kubeconfig","namespace":"default-a4a766-vc-sample-1"}
{"level":"info","ts":1621457638.5329306,"logger":"virtualcluster-controller","msg":"creating secret","name":"serviceaccount-rsa","namespace":"default-a4a766-vc-sample-1"}
{"level":"info","ts":1621457638.562718,"logger":"virtualcluster-controller","msg":"deploying StatefulSet for master component","component":""}
{"level":"error","ts":1621457638.5628488,"logger":"virtualcluster-controller","msg":"fail to create virtualcluster","vc":"vc-sample-1","retrytimes":3,"error":"try to deploy unknwon component: "}
{"level":"info","ts":1621457638.5843189,"logger":"virtualcluster-webhook","msg":"validate update","vc-name":"vc-sample-1"}
{"level":"info","ts":1621457638.6019728,"logger":"virtualcluster-controller","msg":"reconciling VirtualCluster..."}
{"level":"info","ts":1621457638.6020927,"logger":"virtualcluster-controller","msg":"VirtualCluster is pending","vc":"vc-sample-1"}

The namespace and secrets were created, but none of the StatefulSets from the ClusterVersion were.

What I did

git clone https://github.com/kubernetes-sigs/cluster-api-provider-nested.git
cd cluster-api-provider-nested/virtualcluster

Build kubectl-vc

make build WHAT=cmd/kubectl-vc
sudo cp -f _output/bin/kubectl-vc /usr/local/bin

Create new CRDs

(see #62)

cd pkg
controller-gen "crd:trivialVersions=true,maxDescLen=0" rbac:roleName=manager-role paths="./..." output:crd:artifacts:config=config/crds

Install CRD

kubectl create -f config/crds/cluster.x-k8s.io_clusters.yaml
kubectl create -f config/crds/tenancy.x-k8s.io_clusterversions.yaml
kubectl create -f config/crds/tenancy.x-k8s.io_virtualclusters.yaml

Create ns, rbac, deployment, ...

kubectl create -f config/setup/all_in_one.yaml

I've added events to the RBAC because of this:

{"level":"info","ts":1621388803.9796872,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"clusterversion-controller","source":"kind source: /, Kind="}
E0519 01:46:43.981421 1 event.go:260] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"vc-manager-leaderelection-lock.16805486d7f96288", GenerateName:"", Namespace:"vc-manager", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"ConfigMap", Namespace:"vc-manager", Name:"vc-manager-leaderelection-lock", UID:"5c94eb36-66a2-437a-a10f-6fc651533e96", APIVersion:"v1", ResourceVersion:"96800211", FieldPath:""}, Reason:"LeaderElection", Message:"vc-manager-76c5878465-6tq8f_e49ead0e-85c4-43f6-bb44-e4f0820e8ee8 became leader", Source:v1.EventSource{Component:"vc-manager-76c5878465-6tq8f_e49ead0e-85c4-43f6-bb44-e4f0820e8ee8", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xc0213960fa5d0488, ext:18231381017, loc:(*time.Location)(0x23049a0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xc0213960fa5d0488, ext:18231381017, loc:(*time.Location)(0x23049a0)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:serviceaccount:vc-manager:vc-manager" cannot create resource "events" in API group "" in the namespace "vc-manager"' (will not retry!)
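
For reference, assuming the role is generated from kubebuilder markers by the controller-gen invocation above (an assumption about where the rule lives; it may equally be a hand-written rule in all_in_one.yaml), the missing permission can be expressed as one extra marker:

package vcmanager

// Hypothetical placement next to the vc-manager's existing RBAC markers, so
// that the generated manager-role includes an events rule and the
// leader-election event above is no longer rejected.
// +kubebuilder:rbac:groups="",resources=events,verbs=create;patch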

Create a new ClusterVersion

kubectl create -f config/sampleswithspec/clusterversion_v1_nodeport.yaml

I had to remove kind and apiVersion below controllerManager: to match the schema:

error: error validating "cv-sample-nb.yaml": error validating data: [ValidationError(ClusterVersion.spec.controllerManager): unknown field "apiVersion" in io.x-k8s.tenancy.v1alpha1.ClusterVersion.spec.controllerManager, ValidationError(ClusterVersion.spec.controllerManager): unknown field "kind" in io.x-k8s.tenancy.v1alpha1.ClusterVersion.spec.controllerManager]; if you choose to ignore these errors, turn validation off with --validate=false

Create a new VirtualCluster

kubectl vc create -f config/sampleswithspec/virtualcluster_1_nodeport.yaml -o vc.kubeconfig

⚠️ Change default branch to "main"

Instructions here: https://www.kubernetes.dev/resources/rename/

Anytime

  • If a presubmit or postsubmit prowjob triggers on the master branch (branches field of the prowjob), add the main branch to the list (see kubernetes/test-infra#20665 for an example).
  • If the milestone_applier prow config references the master branch, add the main branch to the config (see kubernetes/test-infra#20675 for an example).
  • If the branch_protection prow config references the master branch, add the main branch to the config.

Just before rename

  • Periodic prowjobs, or any prowjob that mentions the master branch
  • If a prowjob mentions master in its name, rename the job to not include the branch name
  • If a prowjob calls scripts or code in your repo that explicitly reference master

Finalizing

  • Set remote head to track main
  • Rename the default branch from master to main
  • #66 fix apidiff branch name

Post-rename

  • If a prowjob still references the master branch in the branches field, remove the master branch
  • If the milestone_applier prow config references the master branch, remove it from the config.
  • Send out comms

/kind cleanup
/wg naming

🌱 Investigate using KubeadmControlPlane (KCP) for NestedCluster

TL;DR

This is a spike to investigate what it would be like to use KCP for implementing CAPN: https://github.com/kubernetes-sigs/cluster-api/tree/master/controlplane/kubeadm. KCP uses kubeadm's ClusterConfiguration object and is what nearly all Cluster API providers use; it was originally not chosen because it only supports cloud-init based deployments.

Background:

KCP needs to map to Machines, Machines have to map to actual Kubernetes Nodes, and the control plane pods need to show up on those specific nodes. KCP also needs to be able to exec from the management cluster into the workload cluster to get access to etcd for health checks. KCP also doesn't manage client certificates; these are handled on each node by kubeadm, which is called through cloud-init. We should look at the in-tree Docker provider for inspiration on how we could use the KCP outputs to create Pod-based control planes within custom Machine controllers: https://github.com/kubernetes-sigs/cluster-api/tree/master/test/infrastructure/docker
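
For illustration only, a rough sketch of the "Pod instead of VM" idea, assuming a custom Machine infrastructure controller in the management cluster; the function name, image, and namespace handling are hypothetical, and translating KCP's cloud-init payload into container args and volumes is exactly the open question of this spike:

package machine

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha4"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// createControlPlanePod reads the bootstrap data that KCP generated for a
// Machine and creates a Pod in the management (super) cluster instead of a VM.
func createControlPlanePod(ctx context.Context, c client.Client, machine *clusterv1.Machine) error {
	if machine.Spec.Bootstrap.DataSecretName == nil {
		return fmt.Errorf("machine %s has no bootstrap data yet", machine.Name)
	}

	// The bootstrap secret's "value" key holds the cloud-init payload; turning
	// that into container commands and volumes is the hard part to investigate.
	var bootstrap corev1.Secret
	key := client.ObjectKey{Namespace: machine.Namespace, Name: *machine.Spec.Bootstrap.DataSecretName}
	if err := c.Get(ctx, key, &bootstrap); err != nil {
		return err
	}

	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      machine.Name,
			Namespace: machine.Namespace,
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "control-plane",
				Image: "example.com/nested-control-plane:dev", // placeholder
			}},
		},
	}
	return c.Create(ctx, pod)
}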

🌱 Setup Prow Presubmits

We should have our PRs tested with Prow on submission so we can have better assurance of the code.

Links

Our presubmit should run make generate lint test. It should be called from a script, like other Cluster API providers do, so that we can periodically change what it does without needing to update the test-infra repo. E.g. https://github.com/kubernetes-sigs/cluster-api-provider-aws/tree/main/scripts

🐛 make release-alias-tag fails on prow image-pushing

Step #0: make[3]: Leaving directory '/workspace'
Step #0: make[2]: Leaving directory '/workspace'
Step #0: gcloud container images add-tag gcr.io/k8s-staging-cluster-api-nested/cluster-api-nested-controller:v20210608-ea129c6 gcr.io/k8s-staging-cluster-api-nested/cluster-api-nested-controller:main
Step #0: Created [gcr.io/k8s-staging-cluster-api-nested/cluster-api-nested-controller:main].
Step #0: Updated [gcr.io/k8s-staging-cluster-api-nested/cluster-api-nested-controller:v20210608-ea129c6].
Step #0: gcloud container images add-tag gcr.io/k8s-staging-cluster-api-nested/nested-controlplane-controller:v20210608-ea129c6 gcr.io/k8s-staging-cluster-api-nested/nested-controlplane-controller:main
Step #0: ERROR: Error during upload of: gcr.io/k8s-staging-cluster-api-nested/nested-controlplane-controller:main
Step #0: ERROR: (gcloud.container.images.add-tag) Not found: response: {'status': '404', 'content-length': '202', 'x-xss-protection': '0', 'transfer-encoding': 'chunked', 'server': 'Docker Registry', '-content-encoding': 'gzip', 'docker-distribution-api-version': 'registry/2.0', 'cache-control': 'private', 'date': 'Tue, 08 Jun 2021 18:57:01 GMT', 'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json'}
Step #0: Failed to fetch "v20210608-ea129c6" from request "/v2/k8s-staging-cluster-api-nested/nested-controlplane-controller/manifests/v20210608-ea129c6".: <no details provided>
Step #0: make[1]: *** [Makefile:353: release-alias-tag] Error 1
Step #0: make[1]: Leaving directory '/workspace'
Step #0: make: *** [Makefile:346: release-staging] Error 2

make docker-build failed

$ make docker-build
fatal: No names found, cannot describe anything.
docker pull docker.io/docker/dockerfile:experimental
experimental: Pulling from docker/dockerfile
d7f0373ffb1d: Pull complete
Digest: sha256:600e5c62eedff338b3f7a0850beb7c05866e0ef27b2d2e8c02aa468e78496ff5
Status: Downloaded newer image for docker/dockerfile:experimental
docker.io/docker/dockerfile:experimental
docker pull docker.io/library/golang:1.15.3
1.15.3: Pulling from library/golang
e4c3d3e4f7b0: Pull complete
101c41d0463b: Pull complete
8275efcd805f: Pull complete
751620502a7a: Pull complete
aaabf962c4fc: Pull complete
7883babec904: Pull complete
1791d366c848: Pull complete
Digest: sha256:1ba0da74b20aad52b091877b0e0ece503c563f39e37aa6b0e46777c4d820a2ae
Status: Downloaded newer image for golang:1.15.3
docker.io/library/golang:1.15.3
docker pull gcr.io/distroless/static:latest
latest: Pulling from distroless/static
5dea5ec2316d: Pull complete
Digest: sha256:60a7d0c45932b6152b2f7ba561db2f91f58ab14aa90b895c58f72062c768fd77
Status: Downloaded newer image for gcr.io/distroless/static:latest
gcr.io/distroless/static:latest
bash: gcloud: command not found
bash: gcloud: command not found
DOCKER_BUILDKIT=1 docker build --build-arg goproxy=https://proxy.golang.org,direct --build-arg ARCH=amd64 --build-arg ldflags="" . -t gcr.io//cluster-api-nested-controller-amd64:dev
invalid argument "gcr.io//cluster-api-nested-controller-amd64:dev" for "-t, --tag" flag: invalid reference format

⚠️ Upgrade Controller Runtime in VirtualCluster

This issue is to track upgrading from controller-runtime v0.6.1 to v0.7.2+. This brings in a lot of changes to Controller Runtime and is closer to the release we use for the rest of CAPN.

This will start the move towards more consistent usage of contexts throughout all the reconcilers and the syncer, and align us for easier integration.
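
For context, the most visible change when moving from controller-runtime v0.6.x to v0.7+ is that Reconcile now receives a context.Context from the manager; the reconciler below is a placeholder for illustration, not the actual VC code:

package virtualcluster

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
)

// VirtualClusterReconciler is a placeholder type; the real reconciler in this
// repo carries clients, a scheme, and provisioner hooks.
type VirtualClusterReconciler struct{}

// Old signature (controller-runtime v0.6.x):
//   Reconcile(req ctrl.Request) (ctrl.Result, error)
// New signature (controller-runtime v0.7+): the manager passes the context in,
// so it can be threaded consistently through client calls and the syncer.
func (r *VirtualClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}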

🐛 ClusterVersion CRD gets rejected on k8s v1.20.2

The ClusterVersion CRD is using apiVersion v1beta1, which requires properties in x-kubernetes-list-map-keys to be required (this validation was introduced with kubernetes/kubernetes@2e18741). v1 supports defaults, too.

A manual rebuild with a newer controller-gen (I've used controller-tools 0.5.0) using
controller-gen "crd:trivialVersions=true,maxDescLen=0" rbac:roleName=manager-role paths="./..." output:crd:artifacts:config=config/crds
creates an accepted CRD with the v1 CRD API.

Unfortunately, the CRD generated from make build is broken (contains an invalid validation: section) with newer controller-tools installed. I guess the reason is replace-null in hack/lib/util.sh.

⚠️ Move controlplane.cluster.x-k8s.io components to controlplane/

When we look at other CAPI implementations, instead of using multi-group controllers they are typically deployed as independent controllers, and they're even built that way; for example, see KCP in controlplane/kubeadm/ in https://github.com/kubernetes-sigs/cluster-ap, while the main types, like NestedCluster in our world, usually live at the top level, e.g. https://github.com/kubernetes-sigs/cluster-api-provider-aws. You can even see that the EKS control plane is built in controlplane/eks/: https://github.com/kubernetes-sigs/cluster-api-provider-aws/tree/main/controlplane/eks.

We should adjust to having the controlplane group's 4 resources in controlplane/nested/ and have only the top-level NestedCluster at /. This will also make it easier to co-operate if we eventually transition to KCP (e.g. #44).

Consider how to handle shared instances with VC

This issue was migrated from kubernetes-retired/multi-tenancy#1502.

Issue description

With VC, all of the components installed in a tenant cluster are isolated, and the tenant cluster has all the resources for a specified application.

Here the question is: there are some apps that have a shared component, and that component is shared by many apps. For such apps, I was hoping I could install the shared component in the super cluster but install the app's other components into tenant clusters, and have the apps in all of the tenant clusters access the shared component in the super cluster. Any comments on how I can achieve this?

I think with this model I can also reduce the footprint for the super cluster, as I can abstract some common services, deploy them into the super cluster, and share them with all tenant clusters.

Comments from @christopherhein

Maybe we can take this question over to https://sigs.k8s.io/cluster-api-provider-nested.

Short answer: we do a bit of this, but it's slightly different. We allow nested (virtual) clusters to operate on "real" super cluster Service clusterIPs so that we can have routable clusterIP ranges; this is done via a mutating admission webhook which acts as a proxy to the super cluster. We also have custom syncers written using this model (https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/doc/customresource-syncer.md) for CRDs where we expose only the implementation at the super cluster but want tenant clusters to be able to CRUD them.
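
For illustration only, a minimal sketch of the clusterIP-mirroring idea described above, written as a controller-runtime mutating admission handler; the type name, namespace mapping, and lookup logic are assumptions rather than the actual VC webhook/syncer implementation:

package webhook

import (
	"context"
	"encoding/json"
	"net/http"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// serviceClusterIPMutator copies the clusterIP of the corresponding super
// cluster Service into the tenant Service being created, so the tenant object
// carries a routable clusterIP.
type serviceClusterIPMutator struct {
	superClient client.Client // reads from the super cluster
}

func (m *serviceClusterIPMutator) Handle(ctx context.Context, req admission.Request) admission.Response {
	var tenantSvc corev1.Service
	if err := json.Unmarshal(req.Object.Raw, &tenantSvc); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}

	// Look up the "real" Service in the super cluster; the real syncer derives
	// the namespace from the VirtualCluster, which is simplified away here.
	var superSvc corev1.Service
	key := client.ObjectKey{Namespace: req.Namespace, Name: tenantSvc.Name}
	if err := m.superClient.Get(ctx, key, &superSvc); err != nil {
		// No super cluster counterpart yet; leave the object alone.
		return admission.Allowed("no super cluster service to mirror")
	}

	// Mirror the routable clusterIP into the tenant Service.
	tenantSvc.Spec.ClusterIP = superSvc.Spec.ClusterIP

	mutated, err := json.Marshal(&tenantSvc)
	if err != nil {
		return admission.Errored(http.StatusInternalServerError, err)
	}
	return admission.PatchResponseFromRaw(req.Object.Raw, mutated)
}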
