
lieutenant-operator's Introduction

Project Syn: Lieutenant Operator

A Kubernetes operator that implements the backend for the Lieutenant API.

The operator keeps an inventory of all tenants and clusters in a Project Syn managed Kubernetes cluster.

It also manages supporting infrastructure such as Git repositories and secrets. It can automatically populate Git repositories with template files when a new cluster is added, and it generates a token to be used by Steward.

This repository is part of Project Syn. For documentation on Project Syn and this component, see https://syn.tools.

Documentation

Documentation for this component is written using Asciidoc and Antora. It is located in the docs/ folder. The Divio documentation structure is used to organize its content.

You can use the make docs-serve command and then browse to http://localhost:2020 to preview the documentation.

Deployment

A Kustomize setup is available under config/samples/deployment.

Example:

kubectl create ns syn-lieutenant
kubectl -n syn-lieutenant apply -k config/crd/
kubectl -n syn-lieutenant apply -k config/samples/deployment

Some example data to test the operator is available under config/samples/.

Development

The Operator is implemented using Kubebuilder.

There are many make targets available. Run make help to get a list of relevant targets.

Contributing and license

This library is licensed under BSD-3-Clause. For information about how to contribute see CONTRIBUTING.


lieutenant-operator's Issues

Generate Roles and RoleBindings for Access Control for Tenants

Context

There is currently no way to only allow a Tenant to access (CRUD) the Cluster objects belonging to it. A simple way to achieve this is to automatically manage a ServiceAccount, Role and RoleBinding per Tenant.

Each Tenant object automatically gets its own ServiceAccount, which is bound (via a RoleBinding) to a Role that only allows access to the Cluster objects belonging to this Tenant by leveraging the resourceNames field. All three objects carry an ownerRef to the Tenant object.
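A minimal sketch of such a Role, assuming it is built in Go with the standard client-go types (group, verbs and names are illustrative):

package tenant

import (
    rbacv1 "k8s.io/api/rbac/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// tenantRole grants access only to the Cluster objects named in
// clusterNames, i.e. the clusters belonging to this tenant.
func tenantRole(tenantName, namespace string, clusterNames []string) *rbacv1.Role {
    return &rbacv1.Role{
        ObjectMeta: metav1.ObjectMeta{Name: tenantName, Namespace: namespace},
        Rules: []rbacv1.PolicyRule{{
            APIGroups: []string{"syn.tools"},
            Resources: []string{"clusters"},
            // resourceNames only restricts verbs that address a single,
            // named object; list/watch/create need a separate rule.
            Verbs:         []string{"get", "update", "patch", "delete"},
            ResourceNames: clusterNames,
        }},
    }
}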

Alternatives

An earlier idea was to use Open Policy Agent (OPA) for this, which turned out not to be possible: OPA hooks into API admission, which doesn't cover read access to the API.

OPA should be used only for validation, not for access control.

Ref: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#what-are-they

Admission controllers limit requests to create, delete, modify or connect to (proxy). They do not support read requests.

Fix reconcile loop with GitRepo updates

Updating the Git repository on every reconcile triggers a loop.
That's because the GitRepo controller adds fields that aren't set in the
gitRepoTemplates of the tenants and clusters, so it will always apply
changes.

The operator will permanently reconcile all tenants and clusters, as they
have the GitRepo as their secondary object. That triggers an
endless loop of reconciles.
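A minimal sketch of a fix, assuming a controller-runtime client and the operator's GitRepo type (the import path is an assumption): only write when the freshly rendered spec actually differs from what is on the API server.

package gitrepo

import (
    "context"

    "k8s.io/apimachinery/pkg/api/equality"
    "sigs.k8s.io/controller-runtime/pkg/client"

    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

// syncGitRepo only calls Update when the desired spec differs from the
// observed one, so defaulted fields no longer retrigger the reconcile.
func syncGitRepo(ctx context.Context, c client.Client, found, desired *synv1alpha1.GitRepo) error {
    if equality.Semantic.DeepEqual(found.Spec, desired.Spec) {
        return nil // nothing changed; don't touch the object
    }
    found.Spec = desired.Spec
    return c.Update(ctx, found)
}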

Panic on Cluster Deletion

When deleting a cluster, the operator panics because of a nil pointer dereference while trying to remove the Steward secret.

Steps to Reproduce the Problem

It is unclear which steps exactly lead to this.
The general steps were:

  1. Create cluster
  2. Reset the bootstrap token a day later
  3. Remove all secrets referenced by the cluster catalog in Vault (according to https://kb.vshn.ch/vshnsyn/how-tos/decommission.html)
  4. Delete cluster

Actual Behavior

The operator panics while trying to handle the deletion of a cluster.

Cluster Resource

apiVersion: syn.tools/v1alpha1
kind: Cluster
metadata:
  creationTimestamp: "2021-07-27T07:10:58Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2021-07-28T13:58:00Z"
  finalizers:
  - cluster.lieutenant.syn.tools
  generation: 6
  labels:
    syn.tools/tenant: t-ancient-morning-1764
  name: c-cold-morning-3608
  namespace: lieutenant-int
  ownerReferences:
  - apiVersion: syn.tools/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: t-ancient-morning-1764
    uid: 4e48dae4-8604-4dfb-9422-51792da07c5d
  resourceVersion: "315984491"
  selfLink: /apis/syn.tools/v1alpha1/namespaces/lieutenant-int/clusters/c-cold-morning-3608
  uid: ae1f94ff-4725-45a9-967d-aa09b12ee096
spec:
  displayName: APPUiO OCP4 Exoscale Setup Test
  facts:
    cloud: exoscale
    distribution: openshift4
    lieutenant-instance: lieutenant-int
    region: ch-gva-2
    service_level: zero
  gitHostKeys: |
    git.vshn.net ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCnE1dMkh+3uHWck+cTvQqeNUW0lj1uVcIC9JX2Tg6gmkKCYA73+o+I7vo4g6nPtSOAfITvYdHJLzwE9GwlSFsXHMR9q0ErWl2wC+w6FawLMz9//5XqiBi2qq/8WnWp3ecY16jDoGRW4eymT+USFHKJVi696XBy3WE/0BBapPZ58WPqkKN6A27qkIK6FehI80f+zN4ZqikdwWuCFs35fsimcmLnWqWPm8zbOkgCiB+ov4O/xmRNHwJWCk/qzU6X/M9YtMXzAa5mjwDvcHSAizFD3a3Fv68G1VsmRZ0THLrRKM/WOxrWNZoimSNgyjTzoCwiKeckvL5+hpNcNSW+eBPt
    git.vshn.net ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO9EkPcVdsz/oVTI2VJkBlq8Mv/dg3rhcbgzAEKyiwUG
  gitRepoTemplate:
    apiSecretRef:
      name: vshn-gitlab
    deletionPolicy: Delete
    deployKeys:
      steward:
        key: |
          AAAAB3NzaC1yc2EAAAADAQABAAACAQC9WS07j2M7Sz5+ox8ew7y3bZJ0OHA6lkSNAXu+eUvVTOqlCMFjujaNZo5tX+019e/KZnhi2/JtBK8mCTXAyzs3xrJvYbACIOwHa33IAhfyEmYa/KtNYJK3dhYVclh10+jUJqMo5cK/41vIw2ApCEMykpbU4rPFsomjGp7igcGq9Zb3vyvf1dtgVJ0bf5psb45a6dsnKSoHMqxGkrfTj9kb/kURMSLkGGSxEdhUSkonbAdNToq+2TjTJEPWM488r7MlG/rsd/7+RQhzHGD8V/+90dDzyJ5YnEfpkCrPB2UxVJWRt5ccMTpnuPtuzrn57NY0mruWpK0JmlnryoEv0aoDaT80YSqZVy09WKcXc4bZ2oR2oaFJkKEfBfo5nVsEaX6fEUYUxX8ALgF1+7SDBwYwg2+/km4o/flUE3UhP0fqpkTGOlB9W/hkZe6ksNPcSWtuPJmeVRghHoY19Kw2nIZ0+3CDpEIWopfmzZNzI8A0T0xv1gNFyM+NiIrDb0ju2YL090WJ4X9iwdIluRMmoL0Nu2yLB/YIHRlFPszZma6c2ZPTeq2O0o7zxqk0ynT8GGiyD4Ns1X+ei2k5uAi7pxG2KdrrF0NpNPBpCHefe184ZtSie5ySkn2agTayyLZGJrbqxa/9uI+2KgIyhyvI5MaMziOP9PhrIhcjtombNedukQ==
        type: ssh-rsa
    displayName: APPUiO OCP4 Exoscale Setup Test
    path: syn-dev/cluster-catalogs
    repoName: c-cold-morning-3608
    repoType: auto
  gitRepoURL: ssh://[email protected]/syn-dev/cluster-catalogs/c-cold-morning-3608.git
  tenantRef:
    name: t-ancient-morning-1764
status:
  bootstrapToken:
    token: <token>
    validUntil: "2021-07-29T13:10:21Z"

Vault Secret

$ vault kv get clusters/kv/t-ancient-morning-1764/c-cold-morning-3608/steward 
====== Metadata ======
Key              Value
---              -----
created_time     2021-07-27T07:10:58.227801984Z
deletion_time    n/a
destroyed        false
version          1

==== Data ====
Key      Value
---      -----
token    <token>

The original secret should still be present on vault-int.

Log

{"level":"info","ts":1627541860.7444935,"logger":"controller_cluster","msg":"Reconciling Cluster","Request.Namespace":"lieutenant-int","Request.Name":"c-cold-morning-3608"}
E0729 06:57:40.976044       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 2089 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1551940, 0x223a5e0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x86
panic(0x1551940, 0x223a5e0)
	/usr/local/go/src/runtime/panic.go:965 +0x1b9
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).removeSecret(0xc000bf6680, 0xc0007350c0, 0x2a, 0x0, 0x0, 0x20, 0x14fb360)
	/app/pkg/vault/client.go:178 +0x1d0
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).RemoveSecrets(0xc000bf6680, 0xc0004af4a0, 0x1, 0x1, 0xc000bf6680, 0x0)
	/app/pkg/vault/client.go:153 +0x7e
github.com/projectsyn/lieutenant-operator/pkg/vault.HandleVaultDeletion(0x192d338, 0xc000b396c0, 0xc00047dd40, 0x203000, 0x0, 0x0, 0xc00042de00)
	/app/pkg/vault/reconcile_steps.go:99 +0x2b9
github.com/projectsyn/lieutenant-operator/pkg/pipeline.RunPipeline(0x192d338, 0xc000b396c0, 0xc00047dd40, 0xc0006a19a8, 0x8, 0x8, 0x0, 0xc0006a19c0, 0x40e078, 0x30)
	/app/pkg/pipeline/pipeline.go:61 +0xb1
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.clusterSpecificSteps(0x192d338, 0xc000b396c0, 0xc00047dd40, 0x0, 0x0, 0x0, 0x0)
	/app/pkg/controller/cluster/cluster_reconcile.go:70 +0x1d8
github.com/projectsyn/lieutenant-operator/pkg/pipeline.RunPipeline(0x192d338, 0xc000b396c0, 0xc00047dd40, 0xc0006a1bd8, 0x6, 0x6, 0x13, 0x18f7b60, 0xc000b396c0, 0x0)
	/app/pkg/pipeline/pipeline.go:61 +0xb1
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.(*ReconcileCluster).Reconcile(0xc000678558, 0xc000c07810, 0xe, 0xc000755170, 0x13, 0x1aeac146c1, 0xc000364480, 0xc00014e788, 0xc00014e750)
	/app/pkg/controller/cluster/cluster_reconcile.go:52 +0x570
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000660300, 0x15bb220, 0xc0000ab620, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x166
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000660300, 0xc000308500)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xb0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(...)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00067a420)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5f
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00067a420, 0x3b9aca00, 0x0, 0x1, 0xc000114420)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0x105
k8s.io/apimachinery/pkg/util/wait.Until(0xc00067a420, 0x3b9aca00, 0xc000114420)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x32d
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x130e9f0]

goroutine 2089 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x109
panic(0x1551940, 0x223a5e0)
	/usr/local/go/src/runtime/panic.go:965 +0x1b9
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).removeSecret(0xc000bf6680, 0xc0007350c0, 0x2a, 0x0, 0x0, 0x20, 0x14fb360)
	/app/pkg/vault/client.go:178 +0x1d0
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).RemoveSecrets(0xc000bf6680, 0xc0004af4a0, 0x1, 0x1, 0xc000bf6680, 0x0)
	/app/pkg/vault/client.go:153 +0x7e
github.com/projectsyn/lieutenant-operator/pkg/vault.HandleVaultDeletion(0x192d338, 0xc000b396c0, 0xc00047dd40, 0x203000, 0x0, 0x0, 0xc00042de00)
	/app/pkg/vault/reconcile_steps.go:99 +0x2b9
github.com/projectsyn/lieutenant-operator/pkg/pipeline.RunPipeline(0x192d338, 0xc000b396c0, 0xc00047dd40, 0xc0006a19a8, 0x8, 0x8, 0x0, 0xc0006a19c0, 0x40e078, 0x30)
	/app/pkg/pipeline/pipeline.go:61 +0xb1
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.clusterSpecificSteps(0x192d338, 0xc000b396c0, 0xc00047dd40, 0x0, 0x0, 0x0, 0x0)
	/app/pkg/controller/cluster/cluster_reconcile.go:70 +0x1d8
github.com/projectsyn/lieutenant-operator/pkg/pipeline.RunPipeline(0x192d338, 0xc000b396c0, 0xc00047dd40, 0xc0006a1bd8, 0x6, 0x6, 0x13, 0x18f7b60, 0xc000b396c0, 0x0)
	/app/pkg/pipeline/pipeline.go:61 +0xb1
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.(*ReconcileCluster).Reconcile(0xc000678558, 0xc000c07810, 0xe, 0xc000755170, 0x13, 0x1aeac146c1, 0xc000364480, 0xc00014e788, 0xc00014e750)
	/app/pkg/controller/cluster/cluster_reconcile.go:52 +0x570
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000660300, 0x15bb220, 0xc0000ab620, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x166
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000660300, 0xc000308500)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xb0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(...)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00067a420)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5f
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00067a420, 0x3b9aca00, 0x0, 0x1, 0xc000114420)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0x105
k8s.io/apimachinery/pkg/util/wait.Until(0xc00067a420, 0x3b9aca00, 0xc000114420)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x32d

Expected Behavior

The operator handles the cluster deletion, or returns an error without crashing if the cluster resource is in an inconsistent state.
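The trace points at removeSecret dereferencing data that no longer exists after the secrets were removed by hand in step 3. A minimal sketch of a defensive fix, written against the plain hashicorp/vault API rather than the operator's actual bank-vaults based wrapper: treat a missing secret as already deleted.

package vault

import (
    vaultapi "github.com/hashicorp/vault/api"
)

// removeSecretSafe deletes a KV secret but tolerates it being gone
// already, e.g. after a manual cleanup in Vault.
func removeSecretSafe(c *vaultapi.Client, metadataPath string) error {
    secret, err := c.Logical().Read(metadataPath)
    if err != nil {
        return err
    }
    if secret == nil || secret.Data == nil {
        return nil // already removed; nothing to dereference or delete
    }
    _, err = c.Logical().Delete(metadataPath)
    return err
}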

New field for default cluster catalog GitRepo location in Tenant object

Add a new field to the Tenant object to allow setting a default location for cluster catalog repositories and specify which secret will be used:

spec:
  clusterCatalogGitRepo:
    baseURL: ssh://[email protected]/syn/cluster-catalogs/
    apiSecretRef:
      name: git-example-credentials

This field can then be used by the API to figure out where cluster catalog repositories are stored by default.

Example: spec.clusterCatalogGitRepo.baseURL could be set to ssh://[email protected]/syn/cluster-catalogs/, which the API then uses to fill in the corresponding field when a cluster is created without an explicit location for its catalog Git repository.
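A sketch of the corresponding Go type, mirroring the YAML above (names are proposals, not final API):

package v1alpha1

import corev1 "k8s.io/api/core/v1"

// ClusterCatalogGitRepo describes the default location for cluster
// catalog repositories of a tenant.
type ClusterCatalogGitRepo struct {
    // BaseURL under which catalog repos are created, e.g.
    // "ssh://[email protected]/syn/cluster-catalogs/".
    BaseURL string `json:"baseURL,omitempty"`
    // APISecretRef points to the secret holding the Git API credentials.
    APISecretRef corev1.SecretReference `json:"apiSecretRef,omitempty"`
}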

Debug Issues with Vault on Object Deletion

Fix deletion when Vault is deactivated:

E0721 09:04:34.474812       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 1074 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x15db4a0, 0x25a6b40)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x15db4a0, 0x25a6b40)
	/usr/local/go/src/runtime/panic.go:969 +0x166
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).listSecrets(0x0, 0xc0006463f0, 0x26, 0x203000, 0xc000d35640, 0x127a45c, 0xc00003c500, 0xc000798120)
	/app/pkg/vault/client.go:236 +0x37
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).removeSecret(0x0, 0xc0006463f0, 0x26, 0x0, 0x0, 0x20, 0x156a7a0)
	/app/pkg/vault/client.go:157 +0x5a
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).RemoveSecrets(0x0, 0xc000a589c0, 0x1, 0x1, 0x1a436a0, 0xc00053f890)
	/app/pkg/vault/client.go:146 +0x7e
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.(*ReconcileCluster).Reconcile.func1(0x0, 0x0)
	/app/pkg/controller/cluster/cluster_reconcile.go:123 +0xcc1
k8s.io/client-go/util/retry.OnError.func1(0x13, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/util/retry/util.go:51 +0x3c
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000d35bd0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:292 +0x51
k8s.io/client-go/util/retry.OnError(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0x1889db0, 0xc000d35c90, 0x1161bd6, 0xc000270fe0)
	/go/pkg/mod/k8s.io/[email protected]/util/retry/util.go:50 +0xa6
k8s.io/client-go/util/retry.RetryOnConflict(...)
	/go/pkg/mod/k8s.io/[email protected]/util/retry/util.go:104
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.(*ReconcileCluster).Reconcile(0xc0004e0140, 0xc0008ae280, 0xa, 0xc000270fe0, 0x12, 0x0, 0xbfbdc9c8981f5bc9, 0xc000bb4d80, 0xc000bb4cf8)
	/app/pkg/controller/cluster/cluster_reconcile.go:45 +0x266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0009d6300, 0x164dcc0, 0xc0009b0080, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x161
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0009d6300, 0xc000f84200)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0009d6300)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0003100b0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5f
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003100b0, 0x3b9aca00, 0x0, 0x1, 0xc00069a3c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc0003100b0, 0x3b9aca00, 0xc00069a3c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x305
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x13986e7]

goroutine 1074 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x15db4a0, 0x25a6b40)
	/usr/local/go/src/runtime/panic.go:969 +0x166
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).listSecrets(0x0, 0xc0006463f0, 0x26, 0x203000, 0xc000d35640, 0x127a45c, 0xc00003c500, 0xc000798120)
	/app/pkg/vault/client.go:236 +0x37
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).removeSecret(0x0, 0xc0006463f0, 0x26, 0x0, 0x0, 0x20, 0x156a7a0)
	/app/pkg/vault/client.go:157 +0x5a
github.com/projectsyn/lieutenant-operator/pkg/vault.(*BankVaultClient).RemoveSecrets(0x0, 0xc000a589c0, 0x1, 0x1, 0x1a436a0, 0xc00053f890)
	/app/pkg/vault/client.go:146 +0x7e
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.(*ReconcileCluster).Reconcile.func1(0x0, 0x0)
	/app/pkg/controller/cluster/cluster_reconcile.go:123 +0xcc1
k8s.io/client-go/util/retry.OnError.func1(0x13, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/util/retry/util.go:51 +0x3c
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000d35bd0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:292 +0x51
k8s.io/client-go/util/retry.OnError(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0x1889db0, 0xc000d35c90, 0x1161bd6, 0xc000270fe0)
	/go/pkg/mod/k8s.io/[email protected]/util/retry/util.go:50 +0xa6
k8s.io/client-go/util/retry.RetryOnConflict(...)
	/go/pkg/mod/k8s.io/[email protected]/util/retry/util.go:104
github.com/projectsyn/lieutenant-operator/pkg/controller/cluster.(*ReconcileCluster).Reconcile(0xc0004e0140, 0xc0008ae280, 0xa, 0xc000270fe0, 0x12, 0x0, 0xbfbdc9c8981f5bc9, 0xc000bb4d80, 0xc000bb4cf8)
	/app/pkg/controller/cluster/cluster_reconcile.go:45 +0x266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0009d6300, 0x164dcc0, 0xc0009b0080, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x161
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0009d6300, 0xc000f84200)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0009d6300)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0003100b0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5f
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003100b0, 0x3b9aca00, 0x0, 0x1, 0xc00069a3c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc0003100b0, 0x3b9aca00, 0xc00069a3c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x305
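Note the 0x0 receiver in the removeSecret and listSecrets frames: the Vault client itself is nil because Vault is deactivated, yet the deletion step still runs. A minimal sketch of a guard (type and function names are illustrative):

package vault

// BankVaultClient stands in for the operator's Vault client type.
type BankVaultClient struct{}

func (b *BankVaultClient) RemoveSecrets(paths []string) error {
    // real implementation talks to Vault; elided in this sketch
    return nil
}

// handleVaultDeletion skips the cleanup entirely when the client was
// never initialised (Vault deactivated) instead of calling methods on
// a nil pointer, which is what the trace above shows.
func handleVaultDeletion(b *BankVaultClient, paths []string) error {
    if b == nil {
        return nil
    }
    return b.RemoveSecrets(paths)
}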

Git repo URL not set after cluster creation

After I created a new cluster (id: c-dry-dust-6989) through control.vshn.net, the operator failed to update the Git repo URL in the cluster object.

Steps to Reproduce the Problem

  1. Create cluster through control.vshn.net with Repository Type "auto"
  2. commodore -d tmp catalog compile $cluster -> Error: > API did not return a repository URL for cluster '$cluster'

Actual Behavior

k -n lieutenant-prod logs lieutenant-prod-controller-manager-57674d9b84-v7s4l

2021-08-19T12:27:33.227Z        INFO    controller-runtime.manager.controller.gitrepo.RunPipeline       running steps   {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod", "steps": ["deletion", "add deletion protection", "handle finalizer", "update object"]}
2021-08-19T12:27:33.227Z        INFO    controller-runtime.manager.controller.gitrepo.RunPipeline       ran step        {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod", "step": "deletion", "result": {"Abort":false,"Err":null,"Requeue":false}, "step_index": 0}
2021-08-19T12:27:33.227Z        INFO    controller-runtime.manager.controller.gitrepo.RunPipeline       ran step        {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod", "step": "add deletion protection", "result": {"Abort":false,"Err":null,"Requeue":false}, "step_index": 1}
2021-08-19T12:27:33.227Z        INFO    controller-runtime.manager.controller.gitrepo.RunPipeline       ran step        {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod", "step": "handle finalizer", "result": {"Abort":false,"Err":null,"Requeue":false}, "step_index": 2}
2021-08-19T12:27:33.227Z        DEBUG   controller-runtime.manager.controller.gitrepo   updating object {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod"}
2021-08-19T12:27:33.286Z        DEBUG   controller-runtime.manager.controller.gitrepo   updating object status  {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod"}
2021-08-19T12:27:33.296Z        ERROR   controller-runtime.manager.controller.gitrepo   conflict while updating object; requeueing      {"reconciler group": "syn.tools", "reconciler kind": "GitRepo", "name": "c-dry-dust-6989", "namespace": "lieutenant-prod", "error": "Operation cannot be fulfilled on gitrepos.syn.tools \"c-dry-dust-6989\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/projectsyn/lieutenant-operator/pipeline.RunPipeline                                                                                                                                                                                      /home/runner/work/lieutenant-operator/lieutenant-operator/pipeline/pipeline.go:66
github.com/projectsyn/lieutenant-operator/pipeline.Common                                                                                                                                                                                           /home/runner/work/lieutenant-operator/lieutenant-operator/pipeline/pipeline.go:87
github.com/projectsyn/lieutenant-operator/pipeline.RunPipeline                                                                                                                                                                                      /home/runner/work/lieutenant-operator/lieutenant-operator/pipeline/pipeline.go:66
github.com/projectsyn/lieutenant-operator/controllers.(*GitRepoReconciler).Reconcile                                                                                                                                                                /home/runner/work/lieutenant-operator/lieutenant-operator/controllers/gitrepo_controller.go:63
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler                                                                                                                                                               /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem                                                                                                                                                            /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2                                                                                                                                                                  /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214

Expected Behavior

The URL is added to the cluster object.
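The conflict error suggests the controller writes back a stale object. A common fix, sketched with client-go's retry helper (the Cluster import path and field name are assumptions based on the spec shown above): re-read the latest version inside the retry closure before mutating it.

package main

import (
    "context"

    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/util/retry"
    "sigs.k8s.io/controller-runtime/pkg/client"

    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

// setGitRepoURL re-fetches the Cluster on every attempt so a concurrent
// writer can't make the update fail permanently.
func setGitRepoURL(ctx context.Context, c client.Client, key types.NamespacedName, url string) error {
    return retry.RetryOnConflict(retry.DefaultRetry, func() error {
        cluster := &synv1alpha1.Cluster{}
        if err := c.Get(ctx, key, cluster); err != nil {
            return err
        }
        cluster.Spec.GitRepoURL = url // field name taken from the spec above
        return c.Update(ctx, cluster)
    })
}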

Implement GitRepo Deletion

Implement the Git repository cleanup process with a deletion policy feature (retain, delete), leveraging the finalizer pattern for actually deleting repositories; see the sketch after the deliverables.

Task Deliverables

  • Controller is able to delete managed Git repositories
  • Deletion policy is implemented
  • Finalizer pattern is used for actual deletion process
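A minimal sketch of that finalizer flow with controller-runtime's controllerutil helpers; the import path, finalizer name and deleteRemote callback are assumptions, and the policy value follows the deletionPolicy: Delete seen elsewhere on this page:

package gitrepo

import (
    "context"

    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

const finalizerName = "gitrepo.lieutenant.syn.tools" // illustrative

// finalizeGitRepo adds the finalizer while the object is live and, once
// deletion is requested, deletes the remote repository (policy
// permitting) before releasing the object.
func finalizeGitRepo(ctx context.Context, c client.Client, repo *synv1alpha1.GitRepo, deleteRemote func() error) error {
    if repo.GetDeletionTimestamp().IsZero() {
        if !controllerutil.ContainsFinalizer(repo, finalizerName) {
            controllerutil.AddFinalizer(repo, finalizerName)
            return c.Update(ctx, repo)
        }
        return nil
    }
    if repo.Spec.DeletionPolicy == "Delete" { // "Retain" would skip this
        if err := deleteRemote(); err != nil {
            return err // finalizer stays; deletion retried next reconcile
        }
    }
    controllerutil.RemoveFinalizer(repo, finalizerName)
    return c.Update(ctx, repo)
}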

Duplicate repoName leads to race condition

It is possible to configure the same repoName for multiple GitRepos, which sends lieutenant-operator into an endless deploy-key reconciliation loop.

Steps to Reproduce the Problem

  1. Create a cluster, using "gitops1" as repoName.
  2. Create a second cluster, also use "gitops1" as a repoName.

Actual Behavior

{"level":"info","ts":1596810672.7987385,"logger":"controller_gitrepo","msg":"Reconciling GitRepo","Request.Namespace":"lieutenant","Request.Name":"c-dawn-lake-5378"}
{"level":"info","ts":1596810675.2224236,"logger":"controller_gitrepo","msg":"keys differed from CRD, keys re-applied to repository","Request.Namespace":"lieutenant","Request.Name":"c-dawn-lake-5378"}
{"level":"info","ts":1596810675.2419648,"logger":"controller_gitrepo","msg":"Reconciling GitRepo","Request.Namespace":"lieutenant","Request.Name":"c-nameless-river-4010"}
{"level":"info","ts":1596810676.9737165,"logger":"controller_gitrepo","msg":"forcing re-creation of key steward","Request.Namespace":"lieutenant","Request.Name":"c-nameless-river-4010"}
{"level":"info","ts":1596810677.6824875,"logger":"controller_gitrepo","msg":"removing key steward; existing on repo but not in CRDs","Request.Namespace":"lieutenant","Request.Name":"c-nameless-river-4010"}
{"level":"info","ts":1596810678.7076182,"logger":"controller_gitrepo","msg":"keys differed from CRD, keys re-applied to repository","Request.Namespace":"lieutenant","Request.Name":"c-nameless-river-4010"}

In my case this actually led to the situation where NO deploy key was configured on GitLab.com at all.

Expected Behavior

One of:

  • Lieutenant prevents creation of GitRepos with duplicated repoNames (API? admission hook? maybe possible to configure in the CRD?); a possible check is sketched after this list
  • Lieutenant supports this configuration and does not fight itself
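A rough sketch of the first option, as a check the operator (or an admission webhook) could run before accepting a GitRepo; the import path and field names are assumptions based on the YAML shown on this page:

package gitrepo

import (
    "context"
    "fmt"

    "sigs.k8s.io/controller-runtime/pkg/client"

    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

// checkDuplicateRepoName rejects a GitRepo whose repoName and path are
// already taken by another GitRepo in the same namespace.
func checkDuplicateRepoName(ctx context.Context, c client.Client, repo *synv1alpha1.GitRepo) error {
    list := &synv1alpha1.GitRepoList{}
    if err := c.List(ctx, list, client.InNamespace(repo.Namespace)); err != nil {
        return err
    }
    for _, other := range list.Items {
        if other.Name == repo.Name {
            continue // the object under validation itself
        }
        if other.Spec.RepoName == repo.Spec.RepoName && other.Spec.Path == repo.Spec.Path {
            return fmt.Errorf("repoName %q at path %q already used by GitRepo %q",
                repo.Spec.RepoName, repo.Spec.Path, other.Name)
        }
    }
    return nil
}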

Implement Tenant GitRepo Bootstrapping

Automate creation of the cluster config file in the tenant configuration repository.
For each new cluster, an empty YAML file needs to be created in the tenant's config repo. The file must be named $CLUSTER_ID.yml (note: .yml, without an "a"); see the sketch after the deliverables.

Task Deliverables

  • Operator automatically creates $CLUSTER_ID.yml files in the tenant repo
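A minimal sketch, assuming the tenant's gitRepoTemplate carries a map of template files, as the "map of file templates" mentioned in another issue on this page suggests (import path and field names are assumptions):

package tenant

import (
    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

// addClusterFile registers an empty <cluster-id>.yml in the tenant's
// config repo template; the Git controller then creates the file.
func addClusterFile(tenant *synv1alpha1.Tenant, clusterID string) {
    if tenant.Spec.GitRepoTemplate.TemplateFiles == nil {
        tenant.Spec.GitRepoTemplate.TemplateFiles = map[string]string{}
    }
    tenant.Spec.GitRepoTemplate.TemplateFiles[clusterID+".yml"] = ""
}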

Set OwnerReference of Clusters to Tenant

Set an ownerReference on newly created clusters to the respective tenant object.

Just as we do for GitRepo objects.

The result is that cluster objects will automatically be deleted when their owner (tenant) is deleted. This can be good or bad; maybe an opt-out mechanism (e.g. an annotation) needs to be available? A sketch follows.
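A sketch using controller-runtime's helper; the opt-out annotation shown is entirely hypothetical:

package cluster

import (
    "k8s.io/apimachinery/pkg/runtime"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

// ownClusterByTenant sets the tenant as controlling owner, so the
// cluster is garbage-collected together with it, unless the (made-up)
// opt-out annotation is present.
func ownClusterByTenant(tenant *synv1alpha1.Tenant, cluster *synv1alpha1.Cluster, scheme *runtime.Scheme) error {
    if cluster.Annotations["lieutenant.syn.tools/skip-gc"] == "true" { // hypothetical opt-out
        return nil
    }
    return controllerutil.SetControllerReference(tenant, cluster, scheme)
}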

Listing files from GitLab is limited and needs pagination

The Operator tries to recreate files via the GitLab API even though they already exist.
It turns out that listing files from GitLab requires pagination: by default, 20 entries are returned per page. If a repository has more than 20 files, odd things start to happen.

Steps to Reproduce the Problem

  1. Have a repository with more than 20 entries
  2. Reconcile a Cluster

Actual Behavior

Listing the tree only returned 20 results.

Error log:

{
   "level":"error",
   "ts":1613489857.210359,
   "logger":"controller-runtime.controller",
   "msg":"Reconciler error",
   "controller":"gitrepo-controller",
   "request":"lieutenant-prod/t-ja3px4",
   "error":"step git repo specific steps failed: POST https://git.vshn.net/api/v4/projects/1280/repository/commits: 400 {message: A file with this name already exists}",
   "stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88"
}

Expected Behavior

Listing and comparing the tree should cover all files, not just the first 20; a paginated listing is sketched below.
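A sketch of a paginated listing with the go-gitlab client the operator uses (helper names follow that library):

package gitrepo

import gitlab "github.com/xanzy/go-gitlab"

// listAllFiles walks all pages of the repository tree instead of only
// the first 20 entries GitLab returns by default.
func listAllFiles(client *gitlab.Client, pid interface{}) ([]*gitlab.TreeNode, error) {
    opt := &gitlab.ListTreeOptions{
        Recursive:   gitlab.Bool(true),
        ListOptions: gitlab.ListOptions{PerPage: 100},
    }
    var all []*gitlab.TreeNode
    for {
        nodes, resp, err := client.Repositories.ListTree(pid, opt)
        if err != nil {
            return nil, err
        }
        all = append(all, nodes...)
        if resp.NextPage == 0 {
            break // last page reached
        }
        opt.Page = resp.NextPage
    }
    return all, nil
}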

Rethink Reconcile Handling

At first the reconcile functions were rather simple and functionality was added directly there.

But we're now at a point where this doesn't scale well, so I'd like to introduce some more structure to the reconcile functions. We have to do things in a certain order, as steps could block or affect other steps during the reconcile. All that logic currently resides inside the respective reconcile loops, which makes it very hard to introduce new functionality with ordering requirements. There's also a lot of repetition, as each reconcile has to perform various steps that are common to all the reconcile loops (adding certain labels, writing the manipulated CR back to the API, etc.).

With some inspiration from https://crossplane.io/docs/master/contributing/services_developer_guide.html I'd like to propose some changes:

We'll define one or more interfaces that expose functions for:

  • Fetching the CR
  • Checking the CR against all mandatory fields (labels etc.)
  • Comparing the state of the external resource (Vault, Git, etc) against the CR definition
  • Triggering the state change for the external resource
  • Reflect the actual state in the CR
  • Write the CR back to the K8s API
  • Other things and maybe some future functionality

These functions fall roughly into two categories, determining the state and applying the state, and the two may be split from each other.

Then we'd need some controller that goes through these functions and determines what actions have to be taken. This sounds like something that could easily be modelled as a finite state machine (FSM), which transitions through the various possible states until it reaches the final state where everything is in sync.

This approach has various benefits. A rough sketch of such a pipeline:
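Judging by the pipeline logs further down this page (steps with a Result of Abort/Err/Requeue), something close to the following shape was eventually adopted; this sketch is illustrative, not the actual code:

package pipeline

import (
    "context"

    "sigs.k8s.io/controller-runtime/pkg/client"
)

// Result tells the pipeline whether to stop, requeue or continue.
type Result struct {
    Abort   bool
    Err     error
    Requeue bool
}

// Step is one unit of reconcile work (fetch, validate, sync, ...).
type Step func(ctx context.Context, obj client.Object) Result

// RunPipeline executes the steps in order and stops at the first one
// that aborts or fails, so ordering constraints live in one place.
func RunPipeline(ctx context.Context, obj client.Object, steps []Step) Result {
    for _, step := range steps {
        if res := step(ctx, obj); res.Abort || res.Err != nil {
            return res
        }
    }
    return Result{}
}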

Dependency Dashboard

This issue provides visibility into Renovate updates and their statuses.

This repository currently has no open or pending branches.

Implement Cluster GitRepo Bootstrapping

Initialise the empty catalog Git repository: for Commodore to work properly, at least a master branch needs to be present. One way to do this is sketched after the deliverables.

Task Deliverables

  • New cluster catalog git repos are bootstrapped with a master branch
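One way to do this with the go-gitlab client: create an initial commit on master, which also creates the branch on an empty project (the placeholder file is an assumption):

package gitrepo

import gitlab "github.com/xanzy/go-gitlab"

// bootstrapMaster creates an initial commit; committing to "master" on
// an empty GitLab project also creates the branch.
func bootstrapMaster(client *gitlab.Client, pid interface{}) error {
    _, _, err := client.Commits.CreateCommit(pid, &gitlab.CreateCommitOptions{
        Branch:        gitlab.String("master"),
        CommitMessage: gitlab.String("Initial commit"),
        Actions: []*gitlab.CommitActionOptions{{
            Action:   gitlab.FileAction(gitlab.FileCreate),
            FilePath: gitlab.String(".gitkeep"),
            Content:  gitlab.String(""),
        }},
    })
    return err
}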

Fix issues with gitlab.com

Managing Git repos on gitlab.com used to work, but now fails with the following error:

{"level":"error","ts":1594726227.2132478,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"gitrepo-controller","request":"lieutenant/c-divine-dew-5824","error":"status code was 503","...}

As this breaks the getting started guide, we should investigate and fix the issue.

Cluster object not immediately reconciled after creation

The cluster object isn't immediately reconciled after the corresponding GitRepo object has been created.

Steps to Reproduce the Problem

  1. Run through projectsyn/documentation#116 with the latest tag for the operator
  2. See that the cluster object doesn't contain the gitRepoURL immediately, only after the full reconcile

Actual Behavior

Commodore cannot compile the catalog until the gitRepoURL is available in the cluster object.

Expected Behavior

gitRepoURL is immediately available in the cluster object.
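With controller-runtime, registering GitRepo as an owned secondary resource makes the owning Cluster reconcile as soon as its GitRepo changes; a sketch (the import path is an assumption, the Reconcile body is elided):

package cluster

import (
    "context"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"

    synv1alpha1 "github.com/projectsyn/lieutenant-operator/api/v1alpha1" // assumed path
)

type ClusterReconciler struct {
    client.Client
}

func (r *ClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    return ctrl.Result{}, nil // real reconcile logic elided
}

// SetupWithManager watches GitRepos as secondary resources, so the
// owning Cluster is enqueued as soon as its GitRepo changes.
func (r *ClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&synv1alpha1.Cluster{}).
        Owns(&synv1alpha1.GitRepo{}).
        Complete(r)
}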

Create ServiceAccount token secrets on Kubernetes 1.24+

Context

Kubernetes 1.24+ no longer creates service account token secrets by default. However, Lieutenant expects those secrets to be present, as the token is used by the registered clusters to authenticate themselves to Lieutenant (and Vault).

We should ensure Lieutenant creates a secret as documented in https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token whenever it creates a service account.
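A minimal sketch following that documentation page: an empty Secret of type kubernetes.io/service-account-token, annotated with the ServiceAccount's name, which the control plane then populates with a token (the secret's naming scheme here is an assumption):

package lieutenant

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// tokenSecretFor builds the token Secret for a ServiceAccount; the
// control plane fills in the actual token.
func tokenSecretFor(sa *corev1.ServiceAccount) *corev1.Secret {
    return &corev1.Secret{
        ObjectMeta: metav1.ObjectMeta{
            Name:      sa.Name + "-token", // naming scheme is an assumption
            Namespace: sa.Namespace,
            Annotations: map[string]string{
                corev1.ServiceAccountNameKey: sa.Name, // kubernetes.io/service-account.name
            },
        },
        Type: corev1.SecretTypeServiceAccountToken,
    }
}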

Alternatives

Rework the cluster authentication completely.

Introduce a template for tenants

Context

In #110 we introduced a template for Clusters. The same is also valuable for Tenants. This would not only address the request projectsyn/lieutenant-api#89 but would also give the ability to set all sorts of defaults.

For the Cluster template, it made sense to embed it into the Tenant. For the Tenant template there is no such "parent" object to make use of. Instead, a new custom resource (TenantTemplate) must be introduced.

Scenario: Template defined
  Given the TenantTemplate
  """
  apiVersion: syn.tools/v1alpha1
  kind: TenantTemplate
  metadata:
    name: default
  spec:
    clusterTemplate:
      gitRepoTemplate:
        apiSecretRef:
          name: "git-credentials"
        repoName: "{{ .Name }}"
  """
  When I create a tenant
  """
  apiVersion: syn.tools/v1alpha1
  kind: Tenant
  metadata:
    name: my-tenant
  spec:
    displayName: "My Tenant"
    clusterTemplate:
      gitRepoTemplate:
        apiSecretRef:
          name: "my-credentials"
  """
  And the lieutenant operator has reconciled that tenant
  Then the tenant has the value
  """
  apiVersion: syn.tools/v1alpha1
  kind: Tenant
  metadata:
    name: my-tenant
    annotations:
      lieutenant.syn.tools/tenant-template: "default"
  spec:
    displayName: "My Tenant"
    clusterTemplate:
      gitRepoTemplate:
        apiSecretRef:
          name: "my-credentials"
        repoName: "{{ .Name }}"
  """

Scenario: No template defined
  Given no TenantTemplate object exists
  When I create a tenant
  """
  apiVersion: syn.tools/v1alpha1
  kind: Tenant
  metadata:
    name: my-tenant
  spec:
    displayName: "My Tenant"
  """
  And the lieutenant operator has reconciled that tenant
  Then the tenant has the value
  """
  apiVersion: syn.tools/v1alpha1
  kind: Tenant
  metadata:
    name: my-tenant
  spec:
    displayName: "My Tenant"
  """

Out of scope

Support for more than one template. This is left for the future, should we have the need for it.

Alternatives

See projectsyn/lieutenant-api#89.

Further, @srueg and @corvus-ch discussed the use of a custom resource vs. the use of a ConfigMap mounted into the pod. From an implementation perspective, both approaches would be more or less the same effort. The custom resource wins for one and a half reasons.

  1. A custom resource is a typed data structure. Its documentation is accessible via kubectl explain.
  2. We will be able to support more than one template. The idea is to set an annotation on the Tenant (or add a new field) that determines which template to make use of.

Thoughts on implementation

Improve Error Handling

Context

Currently the operator logs are the only place where errors show up. This is rather low-level and not visible to regular users.

To improve this, the operator should expose errors and other information in the form of .status conditions and Kubernetes events.
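A minimal sketch of both mechanisms, using apimachinery's condition helper and an event recorder (condition type and reason are illustrative):

package pipeline

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/meta"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/client-go/tools/record"
)

// reportError surfaces a reconcile error both as a status condition and
// as a Kubernetes event on the object.
func reportError(rec record.EventRecorder, obj runtime.Object, conditions *[]metav1.Condition, err error) {
    meta.SetStatusCondition(conditions, metav1.Condition{
        Type:    "Ready", // illustrative condition type
        Status:  metav1.ConditionFalse,
        Reason:  "ReconcileError",
        Message: err.Error(),
    })
    rec.Event(obj, corev1.EventTypeWarning, "ReconcileError", err.Error())
}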


Alternatives

An alternative would be to create a custom field in the object's status containing any error messages. While this might be easier to implement, it would not adhere to Kubernetes standards and would therefore be harder to work with.

Implement GitRepo DisplayName

Implement a DisplayName field for GitRepos. This should be set as the description on the created Git repository (GitLab: "Project description", GitHub: "Description").
The operator should default this field to the DisplayName of the tenant or cluster if possible. A sketch follows the deliverables.

Task Deliverables

  • New CRD field "DisplayName"
  • Sync the display name of the tenant or cluster respectively
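For GitLab, a sketch with the go-gitlab client the operator uses:

package gitrepo

import gitlab "github.com/xanzy/go-gitlab"

// syncDescription mirrors the CRD's DisplayName into the GitLab
// project description.
func syncDescription(client *gitlab.Client, pid interface{}, displayName string) error {
    _, _, err := client.Projects.EditProject(pid, &gitlab.EditProjectOptions{
        Description: gitlab.String(displayName),
    })
    return err
}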

Write Cluster Token to Vault

The operator should automatically store the cluster service account token in Vault; a sketch follows the deliverables.

Task Deliverables

  • Service account tokens of clusters are stored in Vault
  • The tokens are stored under kv/$TENANT_ID/$CLUSTER_ID/steward token=$CLUSTER_TOKEN
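A sketch against the plain Vault API (the operator wraps it via bank-vaults); for a KV v2 engine the request path gains a data/ segment and the payload is nested under data:

package vault

import (
    "fmt"

    vaultapi "github.com/hashicorp/vault/api"
)

// writeStewardToken stores the token under the path given in the
// deliverable above.
func writeStewardToken(c *vaultapi.Client, tenantID, clusterID, token string) error {
    path := fmt.Sprintf("kv/data/%s/%s/steward", tenantID, clusterID)
    _, err := c.Logical().Write(path, map[string]interface{}{
        "data": map[string]interface{}{"token": token},
    })
    return err
}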

Adjust Application Plumbing to K8up

Context

In K8up we put a lot of effort into the plumbing of the operator. Adjust this repository to adhere to the plumbing of K8up. This includes:

  • GitHub issue templates
  • GitHub actions
  • Release workflow
  • automated Changelog generation -> Done in #170
  • Plumbing for testing

Rework Documentation of Lieutenant Operator

Currently the documentation of the Lieutenant Operator is very sparse; it should be updated to make using and understanding the Lieutenant Operator a breeze.

The documentation should:

  • describe the concept of Lieutenant Operator: What does it do? How does it work?
  • show the configuration options

Also, the start page of https://syn.tools/lieutenant-operator/index.html should give a short intro to the Lieutenant Operator, so that readers get an idea what it's all about.

Adhere to https://documentation.divio.com/ for putting the pages and content into the right context.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Open

These updates have all been created already.

Detected dependencies

dockerfile
Dockerfile
github-actions
.github/workflows/e2e.yml
  • actions/checkout v4
  • actions/setup-go v5
  • mikepenz/action-junit-report v4
.github/workflows/lint.yml
  • actions/checkout v4
  • actions/setup-go v5
.github/workflows/master.yml
  • actions/checkout v4
  • actions/setup-go v5
.github/workflows/release.yml
  • actions/checkout v4
  • actions/setup-go v5
  • mikepenz/release-changelog-builder-action v4
  • goreleaser/goreleaser-action v6
.github/workflows/test.yml
  • actions/checkout v4
  • actions/setup-go v5
gomod
go.mod
  • go 1.22.0
  • go 1.22.4
  • dario.cat/mergo v1.0.0
  • github.com/banzaicloud/bank-vaults/pkg/sdk v0.8.3
  • github.com/elastic/crd-ref-docs v0.0.12
  • github.com/go-logr/logr v1.4.2
  • github.com/go-logr/zapr v1.3.0
  • github.com/hashicorp/vault/api v1.14.0
  • github.com/prometheus/client_golang v1.19.1
  • github.com/ryankurte/go-structparse v1.2.0
  • github.com/stretchr/testify v1.9.0
  • github.com/xanzy/go-gitlab v0.105.0
  • go.uber.org/zap v1.27.0
kustomize
config/samples/deployment/kustomization.yaml
npm
e2e/package.json
  • bats 1.11.0


Research Feasibility to Replace Lieutenant Operator by Crossplane

Context

Crossplane has matured a lot since we started with the Lieutenant Operator. Since the introduction of the Composition feature, it might be possible to completely replace the Lieutenant Operator with Crossplane, shipping Compositions and Providers as part of Project Syn to enable the core features needed by the project.

Lieutenant Operator provides three objects:

  • Tenant and Cluster: Both contain some information, and the operator generates a GitRepo object out of them.
  • GitRepo: Manages git repositories on GitLab and files in it.

All these objects could be replaced by XRDs and Compositions.

We should research in detail which functionality we need and how it could be implemented with Crossplane. This research should help decide whether it's feasible to replace the Lieutenant Operator.

Upsides:

  • No software maintenance needed anymore
  • Integrating in a growing ecosystem, leveraging the community
  • Being the "Configuration Management" system for Crossplane

Downsides:

  • "Vendor" lock-in to Crossplane
  • Quite some effort

Create Arbitrary Vault Secrets

Implement the functionality to create arbitrary secrets in Vault for a cluster.

Currently only the cluster token (from the service account) is automatically stored in Vault for a new cluster. It should be possible to define arbitrary secrets which are then generated and stored in Vault.
A concept similar to the one we have for creating files in Git repos (a map of file templates) could be used for Vault secrets. The exact name, length and character set of a secret should be configurable. A secret must be stored under the path <tenant-id>/<cluster-id>/.
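A sketch of what such a configurable secret could look like as a Go type; this is entirely hypothetical and not existing API:

package v1alpha1

// VaultSecretSpec sketches one configurable, generated secret.
type VaultSecretSpec struct {
    // Name of the secret below <tenant-id>/<cluster-id>/.
    Name string `json:"name"`
    // Length of the generated value.
    Length int `json:"length"`
    // Charset the value is drawn from, e.g. "alphanumeric".
    Charset string `json:"charset"`
}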
