syself / cluster-api-provider-hetzner

Kubernetes Cluster API Provider Hetzner provides consistent deployment and day-2 operations of "self-managed" Kubernetes clusters on Hetzner.

License: Apache License 2.0

Dockerfile 0.40% Makefile 3.83% Go 87.94% Python 2.72% Shell 5.11%
cluster-api cluster-api-provider-hetzner k8s-provider-hetzner k8s-sig-cluster-lifecycle k8s-sig-cluster-api kubernetes k8s hetzner hcloud devops

cluster-api-provider-hetzner's People

Contributors

a5r0n, alessiodionisi, aniruddha2000, apricote, batistein, chrisludwig, dependabot[bot], guettli, janiskemper, kranurag7, lion7, lucasrattz, madbbb, mstarostik, preisschild, privatecoder, prometherion, pucilpet, razashahid107, rpahli, sayanta66, souhardya79, syself-bot[bot], testwill, yrs147


cluster-api-provider-hetzner's Issues

Warning ReconcileError secrets "my-cluster-kubeconfig" not found

/kind bug

What steps did you take and what happened:

I followed the docs, but applying the created yaml fails:

guettli@p15$ k get events -A --sort-by=.metadata.creationTimestamp 

default     8m14s       Warning   ReconcileError                machinehealthcheck/my-cluster-control-plane-unhealthy-5m   error creating client and cache for remote cluster: error fetching REST client config for remote cluster "default/my-cluster": failed to retrieve kubeconfig secret for Cluster default/my-cluster: secrets "my-cluster-kubeconfig" not found
default     8m14s       Warning   ReconcileError                machinehealthcheck/my-cluster-md-0-unhealthy-5m            error creating client and cache for remote cluster: error fetching REST client config for remote cluster "default/my-cluster": failed to retrieve kubeconfig secret for Cluster default/my-cluster: secrets "my-cluster-kubeconfig" not found
default     8m35s       Warning   ReconcileError                machinedeployment/my-cluster-md-0                          failed to retrieve HCloudMachineTemplate external object "default"/"my-cluster-md-0": hcloudmachinetemplates.infrastructure.cluster.x-k8s.io "my-cluster-md-0" not found
default     8m35s       Normal    SuccessfulCreate              machinedeployment/my-cluster-md-0                          Created MachineSet "my-cluster-md-0-7465476f6d"
default     8m4s        Normal    ChangeLoadBalancerAlgorithm   hetznercluster/my-cluster                                  Changed load balancer algorithm
default     8m4s        Normal    CreateLoadBalancer            hetznercluster/my-cluster                                  Created load balancer
default     8m          Normal    SuccessfulCreate              hcloudmachine/my-cluster-control-plane-72v5w               Created new server with id 20435964
default     7m56s       Warning   ReconcileError                machinehealthcheck/my-cluster-control-plane-unhealthy-5m   error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "default/my-cluster": Get "https://142.132.242.98:443/api?timeout=10s": dial tcp 142.132.242.98:443: connect: connection refused
default     7m45s       Normal    AddedAsTargetToLoadBalancer   hetznercluster/my-cluster                                  Added new server with id 20435964 to the loadbalancer 719385
default     6m52s       Warning   ReconcileError                machinehealthcheck/my-cluster-control-plane-unhealthy-5m   error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "default/my-cluster": context deadline exceeded
default     6m31s       Warning   ReconcileError                machinehealthcheck/my-cluster-md-0-unhealthy-5m            error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "default/my-cluster": context deadline exceeded
default     66s         Normal    DetectedUnhealthy             machine/my-cluster-control-plane-rb8qb                     Machine default/my-cluster-control-plane-unhealthy-5m/my-cluster-control-plane-rb8qb/ has unhealthy node

What could be the reason?

According to these docs, the secret should be called hetzner: https://github.com/syself/cluster-api-provider-hetzner/blob/main/docs/topics/preparation.md#create-a-secret-for-hcloud-only

I changed this and used the following:

kubectl create secret generic my-cluster-kubeconfig --from-literal=hcloud=$HCLOUD_TOKEN
kubectl patch secret my-cluster-kubeconfig -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}'

Now I get this error:

default     29s         Warning   ReconcileError                machinehealthcheck/my-cluster-md-0-unhealthy-5m            error creating client and cache for remote cluster: error fetching REST client config for remote cluster "default/my-cluster": failed to retrieve kubeconfig secret for Cluster default/my-cluster: secrets "my-cluster-kubeconfig" not found
default     29s         Warning   ReconcileError                machinehealthcheck/my-cluster-control-plane-unhealthy-5m   error creating client and cache for remote cluster: error fetching REST client config for remote cluster "default/my-cluster": failed to retrieve kubeconfig secret for Cluster default/my-cluster: secrets "my-cluster-kubeconfig" not found
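
For comparison, the preparation docs linked above create the token secret under the name hetzner (a sketch; HCLOUD_TOKEN assumed to be exported):

kubectl create secret generic hetzner --from-literal=hcloud=$HCLOUD_TOKEN
kubectl patch secret hetzner -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}'

The my-cluster-kubeconfig secret itself is normally generated by Cluster API once the control plane comes up, so the warnings above may simply resolve once the first control plane node is ready.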

Making controlPlaneLoadBalancer optional?

/kind feature

Describe the solution you'd like
Would it be possible to make the controlPlaneLoadBalancer parameter optional? There are some cases where I don't need the Hetzner load balancer.

talos: machine deployment created too early

I'm running into a problem with the combination of CAPH, CABPT, and CACPPT, where worker nodes end up in an endless reboot loop and never reach the Ready state.
My HetznerCluster resource has:

[...]
spec:
  controlPlaneEndpoint:
    host: ""
    port: 6443
[...]

Now, when creating a cluster with at least one worker node, only the control plane nodes are created with bootstrap data that contains a valid endpoint (load balancer IP as expected). Worker node(s) are missing the host:

kubectl --context kind-test get secret -l cluster.x-k8s.io/cluster-name=test -o json | jq -r '.items[] | select(.metadata.name | endswith("bootstrap-data")) | [.metadata.name, .data.value] | join(" ")' | while read name data; do echo -n "$name: "; base64 -d <<<$data | yq -r .cluster.controlPlane.endpoint; done
test-controlplane-6v8gn-bootstrap-data: https://XXX.XXX.XXX.XXX:6443
test-controlplane-qw2h6-bootstrap-data: https://XXX.XXX.XXX.XXX:6443
test-controlplane-tb9dz-bootstrap-data: https://XXX.XXX.XXX.XXX:6443
test-md-0-5f6c9b9447-gj6gt-bootstrap-data: https://:6443

My assumption is that the workers' bootstrap data is generated before CAPH replaces the empty spec.controlPlaneEndpoint.host value with the LB's external IP.
When creating the cluster with a worker count of 0 and scaling it up later, the bootstrap data is correct and the worker node(s) reach Ready. As it is now, they can never even join. I guess the workers are created too early, since they don't depend on the LB creation. I am not sure whether this is directly CAPH's fault, but I would expect this dependency ordering to be the infrastructure provider's responsibility. CABPT seems to just happily pick up the empty endpoint, and it is only flagged as invalid when the node boots (Invalid controlplane endpoint: host must not be blank).
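
A quick way to see when CAPH has actually filled in the endpoint (a sketch, using the test cluster name and kind context from the command above, and assuming the HetznerCluster is also named test):

kubectl --context kind-test get hetznercluster test -o jsonpath='{.spec.controlPlaneEndpoint.host}'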

cluster with baremetal host waiting for available machine

/kind bug

I set up a cluster with a BareMetalHost using the hetzner-hcloud-control-planes flavor. My cluster template is edited to have one HCloudMachine for the control plane, use the CIDR block 10.243.0.0/16, zero replicas in g-work-md-0 (which would contain HCloud machines), and one replica in g-work-md-1 where bare metal is used.
When the template is applied, the CRs are created, the machine for the control plane is added, and its load balancer too. The bare metal server is reinstalled and is up and running, but it is not added to the cluster. I also deployed a CNI and CCM (Hetzner version).

What did you expect to happen:

The machine should join the cluster and KubeadmConfigTemplate/g-work-md-1 should run the installation process there. Now the cluster is waiting for the machine because it thinks it is unhealthy:

$ clusterctl describe cluster g-work
NAME                                                                       READY  SEVERITY  REASON                       SINCE  MESSAGE                                                       
Cluster/g-work                                                             True                                          51m                                                                   
├─ClusterInfrastructure - HetznerCluster/g-work                                                                                                                                                
├─ControlPlane - KubeadmControlPlane/g-work-control-plane                  True                                          51m                                                                   
│ └─Machine/g-work-control-plane-4gvd4                                     True                                          52m                                                                   
│   └─MachineInfrastructure - HCloudMachine/g-work-control-plane-4fxsc                                                                                                                         
└─Workers                                                                                                                                                                                      
  ├─MachineDeployment/g-work-md-0                                          True                                          54m                                                                   
  └─MachineDeployment/g-work-md-1                                          False  Warning   WaitingForAvailableMachines  54m    Minimum availability requires 1 replicas, current 0 available  
    └─Machine/g-work-md-1-5bd54bfd6c-tzxql                                 True                                          45m                                                                   
      └─MachineInfrastructure - HetznerBareMetalMachine/g-work-md-1-gh6ks                                                                             

$ kubectl describe MachineHealthCheck g-work-md-1-unhealthy-5m
...
Status:
  Conditions:
    Last Transition Time:  2022-11-03T07:49:57Z
    Status:                True
    Type:                  RemediationAllowed
  Expected Machines:       1
  Observed Generation:     1
  Targets:
    g-work-md-1-5bd54bfd6c-tzxql
Events:
  Type     Reason          Age                 From                           Message
  ----     ------          ----                ----                           -------
  Warning  ReconcileError  60m (x16 over 60m)  machinehealthcheck-controller  error creating client and cache for remote cluster: error fetching REST client config for remote cluster "default/g-work": failed to retrieve kubeconfig secret for Cluster default/g-work: secrets "g-work-kubeconfig" not found
  Warning  ReconcileError  59m (x2 over 59m)   machinehealthcheck-controller  error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "default/g-work": client rate limiter Wait returned an error: context deadline exceeded - error from a previous attempt: EOF

$ kubectl describe secrets g-work-kubeconfig
Name:         g-work-kubeconfig
Namespace:    default
Labels:       caph.environment=owned
              cluster.x-k8s.io/cluster-name=g-work
Annotations:  <none>

Type:  cluster.x-k8s.io/secret

Data
====
value:  5535 bytes

$ kubectl describe hetznerbaremetalmachines g-work-md-1-gh6ks
Name:         g-work-md-1-gh6ks
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=g-work
              cluster.x-k8s.io/deployment-name=g-work-md-1
              machine-template-hash=1681069827
              nodepool=g-work-md-1
Annotations:  cluster.x-k8s.io/cloned-from-groupkind: HetznerBareMetalMachineTemplate.infrastructure.cluster.x-k8s.io
              cluster.x-k8s.io/cloned-from-name: g-work-md-1
              infrastructure.cluster.x-k8s.io/HetznerBareMetalHost: default/bm-0
API Version:  infrastructure.cluster.x-k8s.io/v1beta1
Kind:         HetznerBareMetalMachine
...
Status:
  Addresses:
    Address:  46.4.66.173/26
    Type:     InternalIP
    Address:  2a01:4f8:140:2484::2/64
    Type:     InternalIP
    Address:  g-work-md-1-gh6ks
    Type:     Hostname
    Address:  g-work-md-1-gh6ks
    Type:     InternalDNS
  Conditions:
    Last Transition Time:  2022-11-03T07:49:58Z
    Status:                True
    Type:                  AssociateBMHCondition
    Last Transition Time:  2022-11-03T07:49:58Z
    Status:                True
    Type:                  InstanceBootstrapReady
    Last Transition Time:  2022-11-03T07:55:34Z
    Status:                True
    Type:                  InstanceReady
  Last Updated:            2022-11-03T07:51:34Z
  Ready:                   true
Events:                    <none>

Anything else you would like to add:

Configs related to the bare metal server: there is an Ubuntu 22.04 image and a hostSelector that matches the host through an added label. The machine is up and running and reachable with the provided SSH key, but there are no k8s packages or processes on it.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: HetznerBareMetalMachineTemplate
metadata:
  name: g-work-md-1
  namespace: default
spec:
  template:
    spec:
      installImage:
        image:
          path: /root/images/Ubuntu-2204-jammy-amd64-base.tar.gz
          # path: /root/.oldroot/nfs/install/../images/Ubuntu-2004-focal-64-minimal-hwe.tar.gz
        partitions:
        - fileSystem: ext4
          mount: /boot
          size: 1024M
        - fileSystem: ext4
          mount: /
          size: all
        postInstallScript: |
          #!/bin/bash
          apt-get update && apt-get install -y cloud-init apparmor apparmor-utils
      hostSelector:
        matchExpressions:
        - key: "geneea/name"
          operator: "in"
          values:
          - "sigma"
          - "theta"
      sshSpec:
        portAfterCloudInit: 22
        portAfterInstallImage: 22
        secretRef:
          key:
            name: sshkey-name
            privateKey: ssh-privatekey
            publicKey: ssh-publickey
          name: robot-ssh
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: HetznerBareMetalHost
metadata:
  name: "bm-0"
  labels:
    geneea/name: "sigma"
spec:
  serverID: 1702753
  rootDeviceHints:
    raid:
      wwn:
        - "0x500a07511320714e"
        - "0x5002538d41d5de7c"
  maintenanceMode: false
  description: "sigma"
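
As a sanity check, one way to look at the workload cluster directly (a sketch, assuming the g-work-kubeconfig secret shown above is valid):

# fetch the workload cluster kubeconfig and check whether the bare metal machine registered as a node
clusterctl get kubeconfig g-work > g-work.kubeconfig
kubectl --kubeconfig g-work.kubeconfig get nodes -o wide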

Environment:

  • cluster-api-provider-hetzner version: ccm-hetzner-1.1.4 installed to workload cluster, ccm-hcloud-1.0.11 in management cluster
  • Kubernetes version: (use kubectl version) 1.25.2
  • OS (e.g. from /etc/os-release):

Dependency Dashboard 🤖

Awaiting Schedule

These updates are awaiting their schedule. Click on a checkbox to get an update now.

  • 🌱 Update Github Actions group to v40.1.9

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

dockerfile
images/builder/Dockerfile
  • docker.io/library/alpine 3.19.1@sha256:6457d53fb065d6f250e1504b9bc42d5b6c65941d57532c072d929dd0628977d0
  • docker.io/library/alpine 3.19.1@sha256:6457d53fb065d6f250e1504b9bc42d5b6c65941d57532c072d929dd0628977d0
  • docker.io/hadolint/hadolint v2.12.0-alpine@sha256:7dba9a9f1a0350f6d021fb2f6f88900998a4fb0aaf8e4330aa8c38544f04db42
  • docker.io/aquasec/trivy 0.50.1@sha256:0aff831cd122c9cc8dbd25fc75974c21cd49ca7c72d522ce11978373f695f55d
  • docker.io/library/golang 1.21.5-bullseye@sha256:810dd3335e68f0b6ea802486fd0a027dda4013797b6fa58407f354244d9db2b7
images/cache/Dockerfile
  • docker.io/library/alpine 3.19.1@sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
  • docker.io/library/alpine 3.19.1@sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
images/caph/Dockerfile
  • docker.io/library/golang 1.21.6-bullseye@sha256:a8712f27d9ac742e7bded8f81f7547c5635e855e8b80302e8fc0ce424f559295
github-actions
.github/actions/e2e/action.yaml
  • actions/cache v4@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
  • actions/download-artifact v4@c850b930e6ba138125429b7e5c93fc707a7f8427
  • hetznercloud/tps-action dee5dd2546322c28ed8f74b910189066e8b6f31a
  • actions/upload-artifact v4@5d5d22a31266ced268874388b861e4b58bb5c2f3
.github/actions/manager-image/action.yaml
  • docker/setup-buildx-action v3.3.0@d70bba72b1f3fd22344832f00baa16ece964efeb
  • docker/login-action v3.1.0@e92390c5fb421da1463c202d546fed0ec5c39f20
  • actions/cache v4.0.2@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
  • docker/build-push-action v5.3.0@2cdde995de11925a030ce8070c3d77a52ffcf1c0
  • docker/build-push-action v5@2cdde995de11925a030ce8070c3d77a52ffcf1c0
.github/actions/metadata/action.yaml
  • docker/metadata-action v5.5.1@8e5442c4ef9f78752691e2d8f8d19755c6f78e81
.github/actions/setup-go/action.yaml
  • actions/setup-go v5.0.0@0c52d547c9bc32b1aa3301fd7a9cb496313a4491
  • actions/cache v4@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
  • actions/cache v4@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
.github/actions/test-release/action.yaml
  • actions/cache v4@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
  • actions/upload-artifact v4@5d5d22a31266ced268874388b861e4b58bb5c2f3
.github/workflows/build.yml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • docker/setup-qemu-action v3@68827325e0b33c7199eb31dd4e31fbe9023e06e3
  • docker/setup-buildx-action v3@d70bba72b1f3fd22344832f00baa16ece964efeb
  • docker/login-action v3.1.0@e92390c5fb421da1463c202d546fed0ec5c39f20
  • sigstore/cosign-installer v3.4.0@e1523de7571e31dbe865fd2e80c5c7c23ae71eb4
  • actions/cache v4.0.2@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
  • docker/build-push-action v5.3.0@2cdde995de11925a030ce8070c3d77a52ffcf1c0
  • docker/build-push-action v5@2cdde995de11925a030ce8070c3d77a52ffcf1c0
  • actions/upload-artifact v4.3.1@5d5d22a31266ced268874388b861e4b58bb5c2f3
  • docker/build-push-action v5.3.0@2cdde995de11925a030ce8070c3d77a52ffcf1c0
.github/workflows/e2e-basic-baremetal.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-basic-packer.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-basic.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-feature-baremetal.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-feature.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-periodic.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-upgrade-caph.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/e2e-upgrade-kubernetes.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/main-promote-builder-image.yml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • ghcr.io/syself/caph-builder 1.0.16
.github/workflows/pr-e2e.yaml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
.github/workflows/pr-lint.yml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • ghcr.io/syself/caph-builder 1.0.16
.github/workflows/pr-verify.yml
  • kubernetes-sigs/kubebuilder-release-tools v0.4.3@012269a88fa4c034a0acf1ba84c26b195c0dbab4
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/setup-node v4@60edb5dd545a775178f52524783378180af0d1f8
  • actions/create-github-app-token v1@7bfa3a4717ef143a604ee0a99d859b8886a96d00
  • pascalgn/size-label-action v0.5.0@37a5ad4ae20ea8032abf169d953bcd661fd82cd3
  • actions/labeler v5@8558fd74291d67161a8a78ce36a881fa63b766a9
  • EndBug/label-sync v2@52074158190acb45f3077f9099fea818aa43f97a
.github/workflows/release.yml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • docker/setup-qemu-action v3@68827325e0b33c7199eb31dd4e31fbe9023e06e3
  • docker/setup-buildx-action v3@d70bba72b1f3fd22344832f00baa16ece964efeb
  • docker/login-action v3.1.0@e92390c5fb421da1463c202d546fed0ec5c39f20
  • sigstore/cosign-installer v3.4.0@e1523de7571e31dbe865fd2e80c5c7c23ae71eb4
  • docker/build-push-action v5@2cdde995de11925a030ce8070c3d77a52ffcf1c0
  • actions/upload-artifact v4.3.1@5d5d22a31266ced268874388b861e4b58bb5c2f3
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/setup-go v5.0.0@0c52d547c9bc32b1aa3301fd7a9cb496313a4491
  • softprops/action-gh-release v2@9d7c94cfd0a1f3ed45544c887983e9fa900f0564
.github/workflows/report-bin-size.yml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/setup-go v5.0.0@0c52d547c9bc32b1aa3301fd7a9cb496313a4491
  • actions/upload-artifact v4.3.1@5d5d22a31266ced268874388b861e4b58bb5c2f3
.github/workflows/schedule-cache-cleaner-caph-image.yml
  • actions/cache v4.0.2@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9
  • ubuntu 22.04
.github/workflows/schedule-scan-image.yml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • ghcr.io/syself/caph-builder 1.0.16
.github/workflows/schedule-update-bot.yaml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/create-github-app-token v1@7bfa3a4717ef143a604ee0a99d859b8886a96d00
  • renovatebot/github-action v40.1.7@7d358366277001f3316d7fa54ff49a81c0158948
.github/workflows/test.yml
  • actions/checkout v4.1.1@b4ffde65f46336ab88eb53be808477a3936bae11
  • actions/setup-go v5.0.0@0c52d547c9bc32b1aa3301fd7a9cb496313a4491
  • test-summary/action v2.3@032c8a9cec6aaa3c20228112cae6ca10a3b29336
  • actions/upload-artifact v4.3.1@5d5d22a31266ced268874388b861e4b58bb5c2f3
gomod
go.mod
  • go 1.21
  • github.com/blang/semver/v4 v4.0.0
  • github.com/go-logr/logr v1.4.1
  • github.com/go-logr/zapr v1.3.0
  • github.com/hetznercloud/hcloud-go/v2 v2.7.0
  • github.com/onsi/ginkgo/v2 v2.17.1
  • github.com/onsi/gomega v1.32.0
  • github.com/spf13/pflag v1.0.5
  • github.com/stretchr/testify v1.9.0
  • github.com/syself/hrobot-go v0.2.5
  • go.uber.org/zap v1.27.0
  • golang.org/x/crypto v0.22.0
  • golang.org/x/exp v0.0.0-20240409090435-93d18d7e34b8@93d18d7e34b8
  • golang.org/x/mod v0.17.0
  • k8s.io/klog/v2 v2.120.1
  • k8s.io/utils v0.0.0-20240310230437-4693a0247e57@4693a0247e57
  • sigs.k8s.io/controller-runtime v0.16.5
  • sigs.k8s.io/kind v0.22.0
regex
templates/cluster-templates/bases/hcloud-kcp-ubuntu.yaml
  • containerd/containerd 1.7.15
templates/cluster-templates/bases/hetznerbaremetal-kcp-ubuntu.yaml
  • containerd/containerd 1.7.15
templates/cluster-templates/bases/kct-md-0-ubuntu.yaml
  • containerd/containerd 1.7.15
images/builder/Dockerfile
  • lycheeverse/lychee v0.14.3
  • golangci/golangci-lint v1.57.2
  • debian_11/skopeo 1.2.2+dfsg1-1+b6
  • adrienverge/yamllint v1.35.1
  • opt-nc/yamlfixer 0.9.15

Proposal: tracking issue for Hetzner dedicated servers ("coming soon")

Love what this allows us to do on hcloud.

I am particularly excited to see Hetzner dedicated servers coming soon. Creating this issue to track that discussion/work. Hope this is OK.

There's already an exciting startup that's running Kubernetes on Hetzner root/dedicated servers. Here's their roadmap: https://docs.symbiosis.host/about/roadmap

I am unsure how much closer their OSS contributions bring us to running Kubernetes on Hetzner root/dedicated servers.

I sketched out how I would go about it: https://www.reddit.com/r/hetzner/comments/ttm3n7/how_do_you_manage_hetzner_robot_resources_like/i3d0g8w/

Support loadbalancer for ingress-nginx

/kind feature

Describe the solution you'd like
[A clear and concise description of what you want to happen.]
Hi, I've been following this tutorial and have manually created a load balancer with two services pointing to ports 80 and 443 with SSL passthrough, because ingress-nginx should be terminating TLS.

Is it possible to allow the creation of an additional load balancer besides the already provisioned controlPlaneLoadBalancer?

Do I need a regular floating IP so that ingress-nginx is able to pick it up, or should I use something like MetalLB for multiple ones?

Thanks in advance!
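
For what it's worth, a rough sketch of how an extra load balancer could be created through the hcloud cloud-controller-manager instead of by hand (the annotation keys are my assumption from the CCM docs, and the ingress-nginx namespace/service names are the defaults of a standard install; please double-check both):

# expose the ingress controller via a CCM-managed load balancer (assumed annotation keys)
kubectl -n ingress-nginx annotate service ingress-nginx-controller \
  load-balancer.hetzner.cloud/location=fsn1 \
  load-balancer.hetzner.cloud/name=ingress-lb
kubectl -n ingress-nginx patch service ingress-nginx-controller -p '{"spec":{"type":"LoadBalancer"}}'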

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
I'll definitely try out your tilt setup next
https://github.com/syself/cluster-api-provider-hetzner/blob/main/docs/developers/development.md#tilt-for-dev-in-caph
to be able to incorporate your changes sooner.

Environment:

  • cluster-api-provider-hetzner version: v1.1.1 (This new version works with provisioning too)
  • Kubernetes version: (use kubectl version) 1.23.3
  • OS (e.g. from /etc/os-release): fedora-35

How to Guide for Talos

/kind feature

Describe the solution you'd like
First and foremost thank you very much for this great work!

I'd like to ask for a guideline / how-to guide for bootstrapping the cluster with Talos, as I'm sure many people would appreciate it and favor it over standard Linux distros such as Ubuntu.

Support autoscaling from zero

/kind feature

Describe the solution you'd like
[A clear and concise description of what you want to happen.]

Add support for autoscaling from zero defined in https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20210310-opt-in-autoscaling-from-zero.md
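
For reference, the proposal's opt-in mechanism is based on capacity annotations on the scalable resource; a sketch of what that could look like here (annotation keys taken from the linked proposal, my-cluster-md-0 is a hypothetical MachineDeployment):

# tell the autoscaler what a machine of this deployment would provide when scaling from zero
kubectl annotate machinedeployment my-cluster-md-0 \
  capacity.cluster-autoscaler.kubernetes.io/cpu=4 \
  capacity.cluster-autoscaler.kubernetes.io/memory=8Gi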

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Just implemented upstream in kubernetes/autoscaler#4840

See also:

IPv6 Only/Dual Stack support?

/kind feature

Describe the solution you'd like
Hetzner recently added a feature that allows IPv6-only VMs on HCloud. IPv4 addresses are now an extra cost for each instance.
Is there currently a way to do IPv6-only or dual-stack deployments? I haven't seen any documentation on this anywhere, so I assume not.

Talos github organization was renamed from `talos-systems` to `siderolabs`

Just wanted to point out that the GitHub URLs in this repo still point to the old organization.
The organization was renamed a few months ago for legal reasons, so it would be wise to update the URLs in this repo as well :)

These are the references I could find. Simply replace github.com/talos-systems with github.com/siderolabs

"IMAGE_URL=https://github.com/talos-systems/talos/releases/download/{{user `talos_version`}}/hcloud-amd64.raw.xz"

cabpt_uri = "https://github.com/talos-systems/cluster-api-bootstrap-provider-talos/releases/download/{}/bootstrap-components.yaml".format(version)

cacppt_uri = "https://github.com/talos-systems/cluster-api-control-plane-provider-talos/releases/download/{}/control-plane-components.yaml".format(version)
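
A one-liner that would apply this replacement across the repo (a sketch; run from the repo root):

# rewrite all references from the old organization to the new one
grep -rl 'github.com/talos-systems' . | xargs sed -i 's#github.com/talos-systems#github.com/siderolabs#g'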

Fix usage of webhooks in unit tests with envtest

/kind bug

What steps did you take and what happened:
I tried to include unit tests in the /controllers folder that test the functionality of webhooks. However, as far as I can see, no webhooks are used at any stage and therefore the validation of webhooks does not work at all in our current unit tests. I don't know why, as the webhook port should be set correctly.

I tried to disable the renaming happening in the function appendWebhookConfiguration, but it didn't change anything.

What did you expect to happen:
Webhook server should be started in NewTestEnvironment (which is the case currently, as we check this in WaitForWebhooks) and the webhooks should be triggered when a relevant object is created or updated.

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-alpha.10
  • Kubernetes version: 1.22
  • OS: Fedora 34

Improve /docs/reference/hetzner-bare-metal-machine-template.md

Describe the solution you'd like
/docs/reference/hetzner-bare-metal-machine-template.md, and especially template.spec.hostSelector, could probably use some references/examples.

Should we use the GET /server object values from the Hetzner webservice reference as keys in template.spec.hostSelector.matchExpressions.key?

Anything else you would like to add:
I am also wondering whether there should be a cluster-template for HetznerBareMetalHost to make it clearer.

/kind proposal

NameServer limits were exceeded

/kind bug

What steps did you take and what happened:
I've followed the quickstart and set up a kind HA management cluster targeting Hetzner:

clusterctl init --core cluster-api --bootstrap kubeadm --control-plane kubeadm --infrastructure hetzner

# Make local cluster reliable
kubectl -n capi-system scale deployment capi-controller-manager --replicas=2
...

export HCLOUD_TOKEN="psst" \
export SSH_KEY="home-computer" \
export HCLOUD_IMAGE_NAME=fedora-35 \
export CLUSTER_NAME="my-cluster" \
export REGION="fsn1" \
export CONTROL_PLANE_MACHINE_COUNT=1 \
export WORKER_MACHINE_COUNT=2 \
export KUBERNETES_VERSION=1.23.3 \
export HCLOUD_CONTROL_PLANE_MACHINE_TYPE=cpx21 \
export HCLOUD_NODE_MACHINE_TYPE=cpx31

# API secret
kubectl create secret generic hetzner --from-literal=hcloud=$HCLOUD_TOKEN
kubectl patch secret hetzner -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}'

# Notice v1.23.3 and flavor hcloud-network
clusterctl generate cluster my-cluster --kubernetes-version v1.23.3 --control-plane-machine-count=1 --worker-machine-count=2 --flavor hcloud-network > my-cluster.yaml
# Cluster start
kubectl apply -f my-cluster.yaml

# -> Initiated!
kubectl get kubeadmcontrolplane

# Set hetzner kubeconfig as active
clusterctl get kubeconfig my-cluster > $PWD/hetzner.kubeconfig
export KUBECONFIG=$PWD/hetzner.kubeconfig

I received an error similar to this:
kubernetes/kubernetes#82756

# Temporary fix by removing 2 unused IPv6 entries
ssh root@all-servers
# remove 2 lines after 3 nameservers

vi /etc/resolv.conf
sudo resolvconf -a -d eth0

What did you expect to happen:
That Fedora 35 wouldn't use unused nameserver entries.

Anything else you would like to add:
Great project so far!
Is it possible to do this cleanup in a kubeadm cloud-init script, or through systemd-resolved?
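
One possible approach (a sketch, not verified on Fedora 35) would be to trim /etc/resolv.conf before kubeadm runs, e.g. as an entry in the kubeadm bootstrap provider's preKubeadmCommands:

# keep only the first three nameserver entries (the glibc resolver ignores the rest anyway)
awk '/^nameserver/ { if (++n > 3) next } { print }' /etc/resolv.conf > /tmp/resolv.conf && cat /tmp/resolv.conf > /etc/resolv.conf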

Environment:

  • cluster-api-provider-hetzner version: [v1.0.0-alpha.9]
  • Kubernetes version: 1.23.3
  • OS fedora-35

bm-node routing broken after cloud-init (wrong gateway, duplicate routes)

/kind bug
/lifecycle active

What steps did you take and what happened:

  • Create a three-node cluster with flavor hetzner-baremetal-control-planes-remediation / Ubuntu 20.04 HWE.
  • Nodes with IPs within the same subnet – in my case 94.130.10.24 and 94.130.10.27 – won't be able to communicate with each other due to a wrong gateway route-setting:

=> non-working routing after HetznerBareMetalHost has finished rebooting and trying to join the cluster:
[screenshot: non-working routing table]

Pinging between the nodes (SSH-ing to either machine, the first and the second control plane) does not work: Destination Host Unreachable.

routing looks like this in rescue-mode:
[screenshot: working routing table in rescue mode]

This can be fixed manually by replacing the wrong gateway (0.0.0.0):

route del -net 94.130.10.0 gw 0.0.0.0 netmask 255.255.255.192 enp35s0
route add -net 94.130.10.0 gw 94.130.10.1 netmask 255.255.255.192 enp35s0

Also run route del -net 0.0.0.0 gw 94.130.10.1 netmask 0.0.0.0 enp35s0, as the default route is set up twice.

What did you expect to happen:

Correct gateway in routing, no duplicate routes.

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-beta.7
  • Kubernetes version: v1.24.8
  • OS: Ubuntu 20.04 HWE

CAPH-controller – secret has been modified although a provisioned machine uses it

/kind bug

What steps did you take and what happened:

  • Create a three-node cluster with flavor hetzner-baremetal-control-planes-remediation / Ubuntu 20.04 HWE using a fresh kind-bootstrap-cluster.
  • Moving objects to target cluster via clusterctl move
  • all bm-nodes stay in error-state as moving the secrets seems to alter them
  • the controller reports:

[screenshot: caph-controller error message]

What did you expect to happen:

Moving the secrets from the bootstrap to the target cluster seems to alter them, which should be "communicated" to the caph-controller (if that is indeed what happens).
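
One way to verify whether the secret content actually changes during the move (a sketch; the kubectl context names are placeholders, robot-ssh is the SSH secret from this setup):

# record a hash on the bootstrap cluster before clusterctl move ...
kubectl --context kind-bootstrap get secret robot-ssh -o jsonpath='{.data}' | sha256sum
# ... and compare it on the target cluster after the move
kubectl --context target-cluster get secret robot-ssh -o jsonpath='{.data}' | sha256sum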

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-beta.7
  • Kubernetes version: v1.24.8
  • OS: Ubuntu 20.04 HWE

Private network for bare metal host / cloud hybrid

/kind feature

Describe the solution you'd like
It is possible to link bare metal servers to the same private network as the cloud servers. I think it would be good to have an option for the private networking for hybrid clusters.

Anything else you would like to add:
More info: https://docs.hetzner.com/cloud/networks/connect-dedi-vswitch/

Adding the vswitch is possible with the robot web service:
https://robot.your-server.de/doc/webservice/en.html#post-vswitch
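
For reference, creating a vSwitch via the Robot webservice looks roughly like this (a sketch; the parameter names are my reading of the linked webservice docs, please double-check):

# create a vSwitch that bare metal servers can later be attached to
curl -u "$ROBOT_USER:$ROBOT_PASSWORD" https://robot-ws.your-server.de/vswitch -d name=capi-private -d vlan=4000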

imageName in HCloudMachine doesnt work with snapshots

/kind bug

[Before submitting an issue, have you checked the Troubleshooting Guide]
Yes
What steps did you take and what happened:
[A clear and concise description of what the bug is.]
I have created a simple HCloudMachineTemplate. It has an imageName field. This name doesn't work with snapshots, since snapshots in hcloud don't have a name; snapshots only have a description, an ID, and optional labels.

The infrastructure provider controller throws the following error if an ID, description, or label is given in the imageName field.

{"level":"ERROR","time":"2022-03-19T01:08:55.839Z","logger":"controller.hcloudmachine","file":"controller/controller.go:317","message":"Reconciler error","reconciler group":"infrastructure.cluster.x-k8s.io","reconciler kind":"HCloudMachine","name":"my-cluster-control-plane-mwtmn","namespace":"hcloud","error":"failed to reconcile server for HCloudMachine hcloud/my-cluster-control-plane-mwtmn: failed to create server: failed to get server image: no image found with name testImage","errorVerbose":"no image found with name testImage\nfailed to get server image\ngithub.com/syself/cluster-api-provider-hetzner/pkg/services/hcloud/server.(*Service).createServer\n\t/workspace/pkg/services/hcloud/server/server.go:175\ngithub.com/syself/cluster-api-provider-hetzner/pkg/services/hcloud/server.(*Service).Reconcile\n\t/workspace/pkg/services/hcloud/server/server.go:75\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(*HCloudMachineReconciler).reconcileNormal\n\t/workspace/controllers/hcloudmachine_controller.go:185\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(*HCloudMachineReconciler).Reconcile\n\t/workspace/controllers/hcloudmachine_controller.go:146\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\nfailed to create server\ngithub.com/syself/cluster-api-provider-hetzner/pkg/services/hcloud/server.(*Service).Reconcile\n\t/workspace/pkg/services/hcloud/server/server.go:77\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(*HCloudMachineReconciler).reconcileNormal\n\t/workspace/controllers/hcloudmachine_controller.go:185\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(*HCloudMachineReconciler).Reconcile\n\t/workspace/controllers/hcloudmachine_controller.go:146\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\nfailed to reconcile server for HCloudMachine 
hcloud/my-cluster-control-plane-mwtmn\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(*HCloudMachineReconciler).reconcileNormal\n\t/workspace/controllers/hcloudmachine_controller.go:186\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(*HCloudMachineReconciler).Reconcile\n\t/workspace/controllers/hcloudmachine_controller.go:146\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:317\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}

What did you expect to happen:
The image should be selectable using an ID, description, or labels. It is a common use case to pre-build Kubernetes images and use them with Cluster API instead of using default system images and doing all installations during provisioning.
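
For comparison, the hcloud CLI already selects snapshots without a name, e.g. by label (a sketch; the label key/value is hypothetical, and the --selector flag is how I recall the CLI):

# list snapshots carrying a given label
hcloud image list --type snapshot --selector caph-image-name=my-k8s-image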

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • cluster-api-provider-hetzner version:
  • Kubernetes version: (use kubectl version)
  • OS (e.g. from /etc/os-release):

Public IPv4 network settings for machine templates apply incorrectly

/kind bug

After discovering the functionality to disable public network access does exist, I gave it a try and found it was not applying the configuration correctly. See the comments below.

Also there is a typo in the documentation.

Original Request for Optional Public IPv4 addresses for nodes

IPv4 addresses are optional on Hetzner instances and are not strictly necessary when nodes are only intended to be accessible within the cluster and/or external access is always done in front of a load balancer. Alternatively, one might want to exclusively use IPv6 for external access.

It would be nice to recover the expense of IPv4 addresses for pools of nodes that do not strictly need those addresses.

Environment:

  • cluster-api-provider-hetzner version: 1.0.0-beta.3
  • Kubernetes version: 1.25.2
  • OS: N/A

controlplane bootstrap fails/stuck after first controlplane node

/kind bug

What steps did you take and what happened:
I tried to deploy a k8s cluster using the Hetzner CAPI provider, but the control plane is not able to get healthy.
The deployment is stuck after the first control plane node.

My steps:

clusterctl generate provider --infrastructure hetzner:v1.0.0-alpha.20 > hetzner-capi.yml
kubectl apply -f hetzner-capi.yml

export HCLOUD_TOKEN=xxxxxxxxxxxx
export HCLOUD_SSH_KEY="mykey"
export CLUSTER_NAME="mycluster"
export HCLOUD_REGION="fsn1"
export CONTROL_PLANE_MACHINE_COUNT=3
export WORKER_MACHINE_COUNT=3
export KUBERNETES_VERSION=1.24.1
export HCLOUD_CONTROL_PLANE_MACHINE_TYPE=cpx31
export HCLOUD_WORKER_MACHINE_TYPE=cpx31

kubectl create secret generic hetzner --from-literal=hcloud=${HCLOUD_TOKEN}
kubectl patch secret hetzner -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}'
clusterctl generate cluster --infrastructure hetzner:v1.0.0-alpha.20 ${CLUSTER_NAME} > ${CLUSTER_NAME}.yaml
kubectl apply -f ${CLUSTER_NAME}.yaml

The cluster is created but the control plane is not able to get healthy:

$ clusterctl describe cluster ${CLUSTER_NAME}
NAME                                                                       READY  SEVERITY  REASON                       SINCE  MESSAGE                                                                  
Cluster/mycluster                                                          False  Warning   ScalingUp                    71m    Scaling up control plane to 3 replicas (actual 1)                        
├─ClusterInfrastructure - HetznerCluster/mycluster                                                                                                                                                       
├─ControlPlane - KubeadmControlPlane/mycluster-control-plane               False  Warning   ScalingUp                    71m    Scaling up control plane to 3 replicas (actual 1)                        
│ └─Machine/mycluster-control-plane-x5jb5                                  False  Warning   NodeStartupTimeout           49m    Node failed to report startup in &Duration{Duration:20m0s,}              
│   └─MachineInfrastructure - HCloudMachine/mycluster-control-plane-prxwq                                                                                                                                
└─Workers                                                                                                                                                                                                
  └─MachineDeployment/mycluster-md-0                                       False  Warning   WaitingForAvailableMachines  72m    Minimum availability requires 3 replicas, current 0 available            
    └─3 Machines...                                                        True                                          7m53s  See mycluster-md-0-59f5696b48-khjkp, mycluster-md-0-59f5696b48-v57kg, ...

$ kubectl get KubeadmControlPlane
NAME                          CLUSTER         INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
mycluster-control-plane       mycluster       true                                 1                  1         1             73m   v1.24.1

$ kubectl describe KubeadmControlPlane mycluster-control-plane
.....
Events:
  Type     Reason                 Age                    From                              Message
  ----     ------                 ----                   ----                              -------
  Warning  ControlPlaneUnhealthy  3m28s (x285 over 73m)  kubeadm-control-plane-controller  Waiting for control plane to pass preflight checks to continue reconciliation: [machine mycluster-control-plane-x5jb5 does not have APIServerPodHealthy condition, machine mycluster-control-plane-x5jb5 does not have ControllerManagerPodHealthy condition, machine mycluster-control-plane-x5jb5 does not have SchedulerPodHealthy condition, machine mycluster-control-plane-x5jb5 does not have EtcdPodHealthy condition, machine mycluster-control-plane-x5jb5 does not have EtcdMemberHealthy condition]

$ kubectl get MachineHealthCheck
NAME                                       CLUSTER         EXPECTEDMACHINES   MAXUNHEALTHY   CURRENTHEALTHY   AGE
mycluster-control-plane-unhealthy-5m       mycluster       1                  100%                            74m
mycluster-md-0-unhealthy-5m                mycluster       3                  100%                            74m

$ kubectl describe MachineHealthCheck mycluster-control-plane-unhealthy-5m
Events:
  Type     Reason          Age                 From                           Message
  ----     ------          ----                ----                           -------
  Warning  ReconcileError  75m (x13 over 75m)  machinehealthcheck-controller  error creating client and cache for remote cluster: error fetching REST client config for remote cluster "default/mycluster": failed to retrieve kubeconfig secret for Cluster default/mycluster: secrets "mycluster-kubeconfig" not found
  Warning  ReconcileError  74m                 machinehealthcheck-controller  error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "default/mycluster": Get "https://142.132.240.114:443/api?timeout=10s": dial tcp 142.132.240.114:443: i/o timeout
  Warning  ReconcileError  73m (x4 over 74m)   machinehealthcheck-controller  error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "default/mycluster": context deadline exceeded

$ kubectl get secrets
NAME                                TYPE                                  DATA   AGE
default-token-vcccj                 kubernetes.io/service-account-token   3      5h35m
hetzner                             Opaque                                1      78m
mycluster-ca                        cluster.x-k8s.io/secret               2      76m
mycluster-control-plane-xkvnf       cluster.x-k8s.io/secret               2      76m
mycluster-etcd                      cluster.x-k8s.io/secret               2      76m
mycluster-kubeconfig                cluster.x-k8s.io/secret               1      76m
mycluster-md-0-48wdl                cluster.x-k8s.io/secret               2      12m
mycluster-md-0-rlccb                cluster.x-k8s.io/secret               2      13m
mycluster-md-0-x2q8s                cluster.x-k8s.io/secret               2      12m
mycluster-proxy                     cluster.x-k8s.io/secret               2      76m
mycluster-sa                        cluster.x-k8s.io/secret               2      76m

# Get the node status of the deployed cluster
$ kubectl get no --kubeconfig mycluster
NAME                            STATUS   ROLES              AGE    VERSION
mycluster-control-plane-x5jb5   NotReady    control-plane   74m    v1.24.1
mycluster-md-0-khjkp            NotReady    <none>          72m    v1.24.1
mycluster-md-0-v57kg            NotReady    <none>          72m    v1.24.1
mycluster-md-0-x5d32            NotReady    <none>          72m    v1.24.1

# Try to fetch API - possible
$ curl https://142.132.240.114:443/api?timeout=10s -k
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/api\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}

I tested hetzner:v1.0.0-alpha.19 and hetzner:v1.0.0-alpha.20, but I get the same result.

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-alpha.19 and v1.0.0-alpha.20
  • Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.6", GitCommit:"42a9a90338d705a1650fb68b7891f84b62adb0b0", GitTreeState:"clean", BuildDate:"2022-06-15T04:25:21Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
  • OS:
    • Server image: ubuntu 20.04
    • client: OSX

Using this on 3 physical servers with software RAID 1?

/kind feature

Describe the solution you'd like
We are running k8s clusters, currently set up by Puppet, which are configured like this:
software RAID 1 on NVMe disks (fast but smaller) and on HDDs (slow but large),
and we put the OS on a small separate software RAID 1 on HDD; the rest is just unpartitioned disk, grabbed by Ceph.
We then set up Rook/Ceph to use the HDDs (with a 1 Gb interface, fast disks are not going to help Ceph anyway), and we set up a local ZFS storage class for the NVMe disks and use those for HA postgres-operator clusters (with HA handled in the operator, failover and data sync happen in the app, so fast local disk is best).
We then use the Hetzner Docker image that manages a floating IP and points it to the node on which it is running, creating an HA bare metal cluster where the floating IP is simply moved to a different node if the active ingress server dies.
I am trying to figure out how to use this provider to roll out such a cluster setup on 3 new bare metal servers at Hetzner.

ARM compatible image

/kind feature

Describe the solution you'd like
[A clear and concise description of what you want to happen.]

Make image Multi-Arch / ARM compatible

Environment:

  • OS (e.g. from /etc/os-release): macOS 12.2.1 / Raspberry Pi OS (64-bit)

If the load balancer is protected, do not try to delete it

/kind bug

If a load balancer is protected and cannot be deleted by the controller, the system currently retries forever. We should check on deletion whether the LB is protected and then skip the deletion (maybe also emitting an error or event).
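
Until then, the protection state can be checked and lifted manually with the hcloud CLI (a sketch; the subcommand names are from memory, my-cluster-lb is a placeholder):

# inspect the load balancer, then drop its delete protection so the controller can clean it up
hcloud load-balancer describe my-cluster-lb
hcloud load-balancer disable-protection my-cluster-lb delete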

Breaking changes documentation

First off, thanks a bunch for the amazing work on this cluster api provider!
We're using alpha.13 for our development clusters, and so far it's been pretty awesome in tandem with Talos!
I was considering using this to spin up a new production cluster; however, I saw that there are some breaking changes from .16 to .17, but I wasn't able to see what needs to be done in order to migrate from an older version like .13 to .17 :-)

Is there any way to figure out what the breaking change is?

Thanks!

Upgrade controlplane to a more powerful node

/kind bug

Hi, I am asking myself what would be the best way to upgrade the control plane server:
cx21 -> cx31

I've created a helm chart for this:

Before

control_plane_machine_count: 1
hcloud_control_plane_machine_type: cx21

After

control_plane_machine_count: 1
hcloud_control_plane_machine_type: cx31

But upgrading the Helm chart did not have the intended effect of migrating everything to a cx31 server.
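
Since Cluster API treats machine templates as immutable, the usual pattern seems to be to create a second HCloudMachineTemplate with the bigger type and point the KubeadmControlPlane at it; a sketch (resource names hypothetical, and the type field is how I recall the CRD):

# dump the existing control plane template ...
kubectl get hcloudmachinetemplate my-cluster-control-plane -o yaml > cp-cx31.yaml
# ... edit it: new metadata.name (e.g. my-cluster-control-plane-cx31), type: cx31,
#     and strip resourceVersion/uid/creationTimestamp/status, then apply it
kubectl apply -f cp-cx31.yaml
# switch the control plane over; CAPI then rolls out new cx31 machines
kubectl patch kubeadmcontrolplane my-cluster-control-plane --type merge \
  -p '{"spec":{"machineTemplate":{"infrastructureRef":{"name":"my-cluster-control-plane-cx31"}}}}'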

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-alpha.11
  • Kubernetes version: (use kubectl version) 1.23.4
  • OS (e.g. from /etc/os-release): fedora-35

Please make SSH keys optional for a clean Talos setup

/kind feature

Describe the solution you'd like
Specification of SSH Keys should be optional

Anything else you would like to add:
Talos doesn't provide shell access, so there is no point in configuring an SSH key for Talos-based nodes. As it is currently, I have to configure a dummy key only because of https://github.com/syself/cluster-api-provider-hetzner/blob/main/pkg/services/hcloud/server/server.go#L233

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-alpha.14

error during placement (resource_unavailable)

/kind bug

What steps did you take and what happened:
I ran into an issue today where a server is not created. I got the following error:

{"level":"ERROR","time":"2022-02-22T10:12:50.710Z","logger":"controller.hcloudmachine","file":"controller/controller.go:317","message":"Reconciler error","reconciler group":"infrastructure.cluster.x-k8s.io","reconciler kind":"HCloudMachine","name":"test-control-plane-b8a01-f65ww","namespace":"cluster","error":"failed to reconcile server for HCloudMachine cluster/test-control-plane-b8a01-f65ww: failed to create server: error while creating HCloud server test-control-plane-b8a01-f65ww: error during placement (resource_unavailable)","errorVerbose":"error while creating HCloud server test-control-plane-b8a01-f65ww: error during placement (resource_unavailable)\nfailed to create server\ngithub.com/syself/cluster-api-provider-hetzner/pkg/services/hcloud/server.(Service).Reconcile\n\t/workspace/pkg/services/hcloud/server/server.go:78\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(HCloudMachineReconciler).reconcileNormal\n\t/workspace/controllers/hcloudmachine_controller.go:180\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(HCloudMachineReconciler).Reconcile\n\t/workspace/controllers/hcloudmachine_controller.go:141\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\nfailed to reconcile server for HCloudMachine cluster/test-control-plane-b8a01-f65ww\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(HCloudMachineReconciler).reconcileNormal\n\t/workspace/controllers/hcloudmachine_controller.go:181\ngithub.com/syself/cluster-api-provider-hetzner/controllers.(HCloudMachineReconciler).Reconcile\n\t/workspace/controllers/hcloudmachine_controller.go:141\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:317\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:227"}

What did you expect to happen:
The server should be created.

Anything else you would like to add:
It looks like the controller is not requeuing after the error. The workaround is to restart the controller.

Environment:

  • cluster-api-provider-hetzner version: v1.0.0-alpha.10
  • Kubernetes version: v1.23.3
  • OS: fedora-35
