Comments (9)
@MrHohn, got the reload working by adding an annotation to the Deployment's pod template based on the config's sha256 sum. This pattern is described by Helm in its docs.
It ensures pods will roll whenever the contents of the ConfigMap change.
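For readers who want to see the idea outside of Helm templating, here is a minimal Go sketch of the checksum step. The `checksum/config` key follows the convention used in the Helm docs; the function names and the sample config strings are hypothetical and only serve the illustration. Whatever tooling renders the manifest would be responsible for writing the value into the pod template's annotations.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// configChecksum returns the hex-encoded SHA-256 digest of the config contents.
func configChecksum(config []byte) string {
	sum := sha256.Sum256(config)
	return hex.EncodeToString(sum[:])
}

// podAnnotations builds the annotation map for the pod template.
// Any change in the config bytes changes the annotation value, which
// changes the pod template and therefore triggers a rolling update.
func podAnnotations(config []byte) map[string]string {
	return map[string]string{"checksum/config": configChecksum(config)}
}

func main() {
	// Hypothetical config contents, for illustration only.
	oldCfg := []byte(`probe { name: "a" }`)
	newCfg := []byte(`probe { name: "b" }`)

	fmt.Println("old:", podAnnotations(oldCfg)["checksum/config"])
	fmt.Println("new:", podAnnotations(newCfg)["checksum/config"])
}
```

Because the digest depends only on the config bytes, re-applying an unchanged config leaves the annotation untouched and no rollout happens.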
from cloudprober.
FWIW, one potential use case is running cloudprober in a Kubernetes cluster. With the configuration mounted via a ConfigMap (like what #89 does), we can easily implement dynamic probing by editing the ConfigMap in flight.
Dynamic reloading of config in cloudprober is non-trivial to implement. Cloudprober forks off quite a few goroutines to run probes and other components like servers and surfacers. It also opens a bunch of listening sockets. We use contexts in most cases to keep track of and shut down these goroutines, but it's hard to guarantee that everything will clean up nicely and in time when a context is cancelled.
That said, there are a few options here:
- We can provide an option for cloudprober to quit in case of a config change (and if config parses successfully), and let the process/container manager restart cloudprober. This will be cheap and clean to implement but it's disruptive.
- Have a sidecar process watch the config and restart cloudprober with new config in case of a config change. Sidecar process could then monitor the main process after the restart and if it fails to come back up and become healthy, it can restart the main process with the old config.
- Have cloudprober reload the config dynamically. Implement tests to make sure that cloudprober goroutines and listener sockets can be stopped cleanly.
Thinking more about it, I am not sure if I understand the requirement and use case clearly.
@MrHohn, regarding your comment about reloading config in response to ConfigMap changes, I was wondering whether that would be an unsafe thing to do. There would be no way to roll out this change slowly, as ConfigMaps themselves don't seem to have a way to roll out slowly. Can you please comment on whether that's okay in your use case? Also, would it be sufficient for cloudprober to quit in response to a config change?
This stackoverflow thread discusses the same issue:
https://stackoverflow.com/questions/37317003/restart-pods-when-configmap-updates-in-kubernetes
Thanks for getting back, @manugarg. Now that I think about this, it makes a lot of sense to handle this one layer above cloudprober. As you said, it is non-trivial to implement and hard to guarantee safety with cloudprober alone.
Simply making cloudprober quit when the config changes may work, but it sounds a bit too tangled. It also wouldn't have a way to stop the rollout if the new config turns out to be broken.
@cilindrox That's great. Thanks for sharing.
If this helps, I would like the reload to be implemented, as I want to hot reload without restarting, just like many other daemons do.
If the worry is a broken ConfigMap, cloudprober could keep the old config when the reloaded config fails to load, just like Prometheus does, and expose a metric last_succesful_config_reload. That way the prober keeps running and you have a way to check whether the ConfigMap change broke the config.
Of course this is only safe for syntax errors; it doesn't cover cases where the logic is wrong but the syntax checks out.
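A rough Go sketch of that keep-the-old-config behavior follows. It is only an illustration under stated assumptions: the `Reloader` type, its methods, and the toy `parseConfig` check are hypothetical and not part of cloudprober. The success flag and timestamp are the values a metric like the one suggested above could expose.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
	"time"
)

// Reloader holds the active config and the outcome of the latest reload.
type Reloader struct {
	mu          sync.Mutex
	current     string
	parse       func(string) error
	lastOK      bool
	lastSuccess time.Time
}

// NewReloader validates the initial config and fails if it does not parse.
func NewReloader(initial string, parse func(string) error) (*Reloader, error) {
	if err := parse(initial); err != nil {
		return nil, fmt.Errorf("initial config: %w", err)
	}
	return &Reloader{current: initial, parse: parse, lastOK: true, lastSuccess: time.Now()}, nil
}

// Reload swaps in candidate only if it parses; otherwise the old config stays active.
func (r *Reloader) Reload(candidate string) error {
	err := r.parse(candidate)
	r.mu.Lock()
	defer r.mu.Unlock()
	if err != nil {
		r.lastOK = false
		return fmt.Errorf("reload rejected, keeping old config: %w", err)
	}
	r.current = candidate
	r.lastOK = true
	r.lastSuccess = time.Now()
	return nil
}

// Current returns the config that is active right now.
func (r *Reloader) Current() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.current
}

// LastReload returns whether the latest reload succeeded and when the last success was.
func (r *Reloader) LastReload() (bool, time.Time) {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.lastOK, r.lastSuccess
}

// parseConfig is a toy syntax check (non-empty, balanced braces) standing in for a real parser.
func parseConfig(s string) error {
	if strings.TrimSpace(s) == "" {
		return errors.New("empty config")
	}
	if strings.Count(s, "{") != strings.Count(s, "}") {
		return errors.New("unbalanced braces")
	}
	return nil
}

func main() {
	r, err := NewReloader(`probe { name: "a" }`, parseConfig)
	if err != nil {
		panic(err)
	}
	fmt.Println("reload error:", r.Reload(`probe { name: "b"`)) // missing closing brace
	fmt.Println("still running with:", r.Current())
	ok, _ := r.LastReload()
	fmt.Println("last reload ok:", ok)
}
```

As the comment above says, this only protects against configs that fail to parse; a config that parses but probes the wrong targets would be swapped in without complaint.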
@brat002, @fcastello, @MrHohn and other folks, please see the announcement here: #679. Active development of this repository is being moved to github.com/cloudprober/cloudprober. I am closing this issue as I think the core of it is resolved by using a config checksum in deployment annotations like this:
https://cloudprober.org/how-to/run-on-kubernetes/#deployment-map
This allows Kubernetes to roll out config changes, instead of Cloudprober rolling out changes by itself.
If you disagree, please feel free to file another issue at github.com/cloudprober/cloudprober.
Thanks,
Manu