Comments (2)
Thank you for providing the logs!
This looks like a bug, and it is not clear whether it is a regression. I will try to reproduce it and provide a fix.
from prometheus.
I just confirmed my hunch:
This panic can happen if the configuration for the rule groups is reloaded while Prometheus is shutting down:
- t1: The ruler is shut down after the process receives the TERM signal. During shutdown, the groups are closed.
- t2: The process receives the HUP signal and the configuration is reloaded. Note that this actor hasn't received the cancel from the channel yet, because the actors are shut down sequentially. During the configuration reload, the groups are closed again, causing the panic.
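To illustrate the t1/t2 sequence, here is a minimal sketch of how closing a group twice can panic. It assumes that a group is stopped by closing a channel; the `group` type, its `done` field, and `stopTwice` are hypothetical names for illustration and are not the actual Prometheus code.

```go
package main

import "fmt"

// group is a hypothetical stand-in for a rule group, not the Prometheus type.
type group struct {
	done chan struct{}
}

// stop signals the group's goroutines to exit by closing done.
// Calling it a second time panics, because Go does not allow
// closing an already closed channel.
func (g *group) stop() { close(g.done) }

// stopTwice replays the t1/t2 sequence and returns the recovered panic
// message, or "no panic" if nothing went wrong.
func stopTwice() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	g := &group{done: make(chan struct{})}
	g.stop() // t1: shutdown closes the group
	g.stop() // t2: reload closes the same group again
	return "no panic"
}

func main() {
	fmt.Println(stopTwice())
}
```

In a real process the panic would not be recovered and would crash the program; the `recover` here only exists so the sketch can report what happened.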
One approach to solve this problem that I'm thinking of is to make the `Reload` method in the manager a no-op if the manager has already stopped. It does not make sense to reload a closed manager.
Another alternative is to make the `Reload` function return an error if the manager is already closed, but also include a flag that allows consumers of the manager to know its state so that they don't call `Reload`.
I'm inclined to implement the first approach.
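A rough sketch of what the first approach could look like, under the same assumption that groups are closed via channels. The `manager` type, its fields, and `newManager` are hypothetical and much simpler than the real manager; only the idea of guarding `Reload` with a stopped flag comes from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical, simplified stand-in for the rule manager.
type manager struct {
	mtx     sync.Mutex
	stopped bool
	groups  []chan struct{} // one "done" channel per group (assumption)
	reloads int             // number of reloads actually applied
}

func newManager() *manager {
	return &manager{groups: []chan struct{}{make(chan struct{})}}
}

// closeGroups closes every group channel. The caller must hold mtx.
func (m *manager) closeGroups() {
	for _, g := range m.groups {
		close(g)
	}
	m.groups = nil
}

// Stop closes all groups and marks the manager as stopped.
func (m *manager) Stop() {
	m.mtx.Lock()
	defer m.mtx.Unlock()
	if m.stopped {
		return
	}
	m.closeGroups()
	m.stopped = true
}

// Reload swaps in fresh groups. After Stop it does nothing, so a late
// HUP cannot close the groups a second time.
func (m *manager) Reload() {
	m.mtx.Lock()
	defer m.mtx.Unlock()
	if m.stopped {
		return
	}
	m.closeGroups()
	m.groups = []chan struct{}{make(chan struct{})}
	m.reloads++
}

func main() {
	m := newManager()
	m.Reload() // applied
	m.Stop()   // t1: TERM
	m.Reload() // t2: HUP arrives late, ignored
	fmt.Println("reloads applied:", m.reloads)
}
```

Holding the same mutex in `Stop` and `Reload` matters here: without it, a reload could pass the stopped check just before shutdown closes the groups, and the double close would still be possible.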