
Comments (10)

fmunteanu commented on August 25, 2024

For reference, these settings work quite well on my cluster. This Kubernetes blog post has a configuration error related to evictionMaxPodGracePeriod: you're supposed to pass an int value, not a string. See the correct reference.

Ideally, I would like to use a kubelet configuration file, for more flexibility when editing settings.

from k3s.

brandond commented on August 25, 2024

We don't modify any of the code related to the kubelet's parsing of the config file, so it's unclear to me what part of this is a k3s issue. Have you tried turning up the kubelet's verbosity with --v=9 to see if it is failing to parse some bit of your config file?


fmunteanu commented on August 25, 2024

@brandond I have no idea how to set the kubelet verbosity with K3s, can you please let me know, so I can report back the findings? From my perspective, the config argument is passed through kubelet-arg, therefore is directly related to K3s.


brandond commented on August 25, 2024

> turning up the kubelet's verbosity with --v=9

> I have no idea how to set the kubelet verbosity with K3s

I said how. Just add that flag to the k3s args, or v: 9 if you're using a k3s config.yaml.

> the config argument is passed through kubelet-arg, therefore is directly related to K3s.

We just pass that arg to the kubelet. The kubelet's actual parsing of that file is not part of our code base, so if there is an issue with it, it'd need to be fixed in Kubernetes, not K3s.


fmunteanu commented on August 25, 2024

@brandond I found the root cause of the current issue and IMO it is quite an important one. Let me explain the problem below.

When the K3s service is started on Linux, it passes a list of flags to the kubelet:

# kubectl cordon apollo
# journalctl -u k3s --rotate --vacuum-time=1s
# systemctl restart k3s
# journalctl -u k3s | grep 'msg="Running kubelet' | sed 's| --|\n  --|g'
Jun 25 19:03:01 apollo k3s[22973]: time="2024-06-25T19:03:01-04:00" level=info msg="Running kubelet
  --address=0.0.0.0
  --anonymous-auth=false
  --authentication-token-webhook=true
  --authorization-mode=Webhook
  --cgroup-driver=systemd
  --client-ca-file=/var/lib/rancher/k3s/agent/client-ca.crt
  --cluster-dns=10.43.0.10
  --cluster-domain=cluster.local
  --config=/etc/rancher/k3s/kubelet.yaml
  --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock
  --containerd=/run/k3s/containerd/containerd.sock
  --eviction-hard=imagefs.available<5%,nodefs.available<5%
  --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10%
  --fail-swap-on=false
  --healthz-bind-address=127.0.0.1
  --hostname-override=apollo
  --kubeconfig=/var/lib/rancher/k3s/agent/kubelet.kubeconfig
  --node-ip=192.168.4.2
  --node-labels=
  --pod-infra-container-image=rancher/mirrored-pause:3.6
  --pod-manifest-path=/var/lib/rancher/k3s/agent/pod-manifests
  --read-only-port=0
  --register-with-taints=node.cilium.io/agent-not-ready:NoExecute,node-role.kubernetes.io/control-plane:NoSchedule
  --resolv-conf=/run/systemd/resolve/resolv.conf
  --serialize-image-pulls=false
  --tls-cert-file=/var/lib/rancher/k3s/agent/serving-kubelet.crt
  --tls-private-key-file=/var/lib/rancher/k3s/agent/serving-kubelet.key"
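The same flag list can be pulled out of the log line programmatically. Below is a minimal Python sketch, assuming the log format shown above (a single `msg="Running kubelet ..."` entry with `--name=value` tokens and no spaces inside values). The function names and the `FLAG_TO_CONFIG_FIELD` table are illustrative, not part of K3s or the kubelet; the table only covers the three flags discussed in this comment.

```python
import re

# Illustrative mapping from kubelet CLI flags to the KubeletConfiguration
# fields they shadow. Only the flags discussed in this thread are listed.
FLAG_TO_CONFIG_FIELD = {
    "eviction-hard": "evictionHard",
    "eviction-minimum-reclaim": "evictionMinimumReclaim",
    "serialize-image-pulls": "serializeImagePulls",
}


def kubelet_flags(log_line):
    """Extract --name=value pairs from a 'Running kubelet' log line."""
    match = re.search(r'msg="Running kubelet (.*)"', log_line)
    if match is None:
        return {}
    flags = {}
    for token in match.group(1).split():
        if token.startswith("--"):
            name, _, value = token[2:].partition("=")
            flags[name] = value
    return flags


def shadowed_config_fields(flags):
    """List config-file fields that a CLI flag in `flags` would override."""
    return [field for flag, field in FLAG_TO_CONFIG_FIELD.items() if flag in flags]
```

Feeding it the line printed by `journalctl -u k3s | grep 'msg="Running kubelet'` and calling `shadowed_config_fields(kubelet_flags(line))` gives a quick list of the kubelet.yaml fields that will not take effect.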

Note the 3 key arguments:

  --config=/etc/rancher/k3s/kubelet.yaml
  --eviction-hard=imagefs.available<5%,nodefs.available<5%
  --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10%

And a less important one:

  --serialize-image-pulls=false

Since command-line arguments take precedence over the values set in kubelet.yaml, this explains why the user-defined evictionHard and evictionMinimumReclaim values are ignored. Passing the --eviction-hard and --eviction-minimum-reclaim flags at service start is really bad, because it tells the kubelet to ignore the default settings:

Default:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
  imagefs.inodesFree: "5%"

Passing only these two flags drops the defaults for memory.available, nodefs.inodesFree and imagefs.inodesFree:

  --eviction-hard=imagefs.available<5%,nodefs.available<5%
  --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10%

This can result in a major performance impact, since the kubelet ignores the hard eviction thresholds that are not defined:

By changing the default value of only one parameter for evictionHard, the default values of other parameters will not be inherited and will be set to zero. In order to provide custom values, you should provide all the threshold values respectively.
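To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It assumes, as stated in this comment and in the quoted documentation, that a CLI flag wins over the config file and that the threshold map is replaced as a whole rather than merged per key. It is a model for illustration, not kubelet code; all names are hypothetical, and the defaults are the ones quoted earlier in this comment.

```python
# Upstream hard-eviction defaults as quoted earlier in this comment.
UPSTREAM_EVICTION_HARD = {
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%",
    "imagefs.available": "15%",
    "imagefs.inodesFree": "5%",
}


def parse_eviction_flag(value):
    """Parse 'signal<threshold,signal<threshold' into a dict."""
    thresholds = {}
    for item in value.split(","):
        signal, sep, threshold = item.strip().partition("<")
        if not sep or not signal or not threshold:
            raise ValueError(f"malformed threshold: {item!r}")
        thresholds[signal] = threshold
    return thresholds


def effective_eviction_hard(config_file=None, cli_flag=None):
    """Model the precedence: CLI flag > config file > defaults.

    Whichever source wins replaces the whole map; nothing is merged per key.
    """
    if cli_flag is not None:
        return parse_eviction_flag(cli_flag)
    if config_file is not None:
        return dict(config_file)
    return dict(UPSTREAM_EVICTION_HARD)


def missing_signals(effective):
    """Signals that have an upstream default but no threshold in `effective`."""
    return [s for s in UPSTREAM_EVICTION_HARD if s not in effective]


# The flag K3s passes, combined with a hypothetical user kubelet.yaml value.
k3s_flag = "imagefs.available<5%,nodefs.available<5%"
user_config = {"memory.available": "500Mi", "nodefs.available": "10%"}
effective = effective_eviction_hard(config_file=user_config, cli_flag=k3s_flag)
print(effective)                   # {'imagefs.available': '5%', 'nodefs.available': '5%'}
print(missing_signals(effective))  # ['memory.available', 'nodefs.inodesFree', 'imagefs.inodesFree']
```

In this model the user's `memory.available` setting never reaches the effective map, which matches the symptom reported in this issue.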

Say you have a misbehaving application with no limits defined and a memory leak: it will quickly fill the node's memory (since the memory.available eviction threshold is never triggered) until the kubelet crashes.

We need to have these two flags removed at service start. This way, the kubelet can properly apply its default eviction thresholds, which protects the cluster from catastrophic failures. These two lines should be removed:

https://github.com/k3s-io/k3s/blob/master/pkg/daemons/agent/agent_linux.go#L79-L80
https://github.com/k3s-io/k3s/blob/master/pkg/daemons/agent/agent_windows.go#L52-L53

Regarding the --serialize-image-pulls flag, I see why you set it to false:

https://github.com/k3s-io/k3s/blob/master/pkg/daemons/agent/agent_linux.go#L104

I traced this behaviour all the way to K3s 1.22, see #4509 (comment).


fmunteanu commented on August 25, 2024

@brandond I created a PR with the proposed fix.


brandond commented on August 25, 2024

I would consider our support for kubelet configuration files somewhat limited until we address that issue, due to how the kubelet handles configuration passed via both config AND cli.


fmunteanu commented on August 25, 2024

That makes sense @brandond, any idea when this is planned to be pushed to a stable release? Just to make sure the context is clear, this issue has two problems to be addressed:

  • Fix the broken garbage collection, by removing the hardcoded flags mentioned above, see PR
  • Migrating to a kubelet configuration file will implicitly address the hard-coded flag priority we deal with now, see PR


brandond commented on August 25, 2024

Upstream default: imagefs.available<15%,memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%
Our default: imagefs.available<5%,nodefs.available<5%

  1. We are not currently planning on changing the default eviction behavior, as it has been working for our users/customers for the last 6 years without a single complaint prior to yours. I don't have the full context on these customizations, but my understanding is that these defaults were intended to make k3s more usable on small nodes (e.g. an early Raspberry Pi with 512MB of memory and SD card storage) where low free memory and inode count may be an expected condition, similar to how k3s has always allowed the kubelet to run with swap enabled, which upstream Kubernetes did not.
    We will improve the ability to alter/override our defaults, but that is as much as I am willing to commit to at the moment.
  2. We don't currently have a target date for switching to kubelet config instead of CLI args. Upstream shows no signs of actually removing CLI-based configuration, despite it being deprecated (and deprecated for only the kubelet, not any other component) for several years.


fmunteanu commented on August 25, 2024

Thank you @brandond for taking the time to discuss it with the devs internally. At the end of the day, if end users know where the issue is, they know how to fix it. I really appreciate you spending time on this.

Like you mentioned earlier, the forced kubelet settings can be fixed by passing new args to override the defaults you have hardcoded. I was simply trying to reinforce the defaults presented in K8s.
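For anyone landing here later, a minimal Python sketch of that override: it renders a `kubelet-arg` entry for the K3s config.yaml that lists every hard-eviction threshold explicitly. It assumes that a user-supplied `kubelet-arg` value replaces the hardcoded `--eviction-hard` flag, as discussed above. The function name is hypothetical and the thresholds are the upstream defaults quoted earlier in this thread; adjust them for your nodes.

```python
# Upstream hard-eviction defaults as quoted earlier in this thread.
UPSTREAM_EVICTION_HARD = {
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%",
    "imagefs.available": "15%",
    "imagefs.inodesFree": "5%",
}


def render_kubelet_arg(eviction_hard):
    """Render a K3s config.yaml fragment that sets --eviction-hard in full."""
    value = ",".join(f"{signal}<{threshold}" for signal, threshold in eviction_hard.items())
    return f'kubelet-arg:\n  - "eviction-hard={value}"\n'


print(render_kubelet_arg(UPSTREAM_EVICTION_HARD))
```

Listing every signal matters because, as noted above, any threshold left out of the flag is not inherited from the defaults.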

I'm going to close this issue because I consider your answer very satisfactory, especially the part where configuration improvements are planned. I'm glad we sorted everything out; it will be useful for other users. I started looking into kubelet settings when I saw a cluster kubelet crashing because it was out of memory.

