Comments (2)
Something is wrong with the liveness probe. I did an upgrade and two of the three slaves I have
[christopher@chris-msi ~]$ kubectl logs netdata-slave-pdclb
...
2019-04-29 18:25:27: netdata ERROR : MAIN : Failed to read machine GUID from '/var/lib/netdata/registry/netdata.public.unique.id'
2019-04-29 18:25:27: netdata FATAL : MAIN :Cannot create unique machine id file '/var/lib/netdata/registry/netdata.public.unique.id'. Please fix this. # : Invalid argument
...
After that, netdata exited. It looks like something went wrong with the preStart command that creates this file, probably a race condition. When I deleted the pod and it was recreated, it worked fine.
I'll try to replicate the issue with preStart, but the worrisome thing is that the two problematic pods were showing as running instead of error. That's not good, so I'll try to get on this tomorrow.
I saw that we didn't have readiness/liveness probes in the daemonset (no web server running anyway), so I added them. It takes much longer for the slaves to show up as ready now, but I expect it to work better. I will close this issue with PR #19 and create a new issue to check on the apparent preStart race condition.
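For anyone following along, probes of the kind described above might look roughly like the fragment below. This is only a sketch and not the actual content of PR #19. It assumes the agent's web server is reachable inside the slave pods on Netdata's default port 19999. The `/api/v1/info` path and every timing value are illustrative assumptions.

```yaml
# Hypothetical container-level probe settings for the DaemonSet pod template.
# Port, path, and timings are assumptions for illustration, not the chart's values.
livenessProbe:
  httpGet:
    path: /api/v1/info      # assumed health endpoint
    port: 19999             # Netdata's default web port
  initialDelaySeconds: 30   # give the agent time to start before the first check
  periodSeconds: 30
  timeoutSeconds: 1
  failureThreshold: 3       # restart the container after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /api/v1/info
    port: 19999
  periodSeconds: 30
  timeoutSeconds: 1
  failureThreshold: 3       # mark the pod not ready after 3 consecutive failures
```

With a readiness probe in place, a pod is not reported as ready until the check succeeds, which would be consistent with the slaves taking longer to show up as ready.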