
Comments (9)

bwplotka commented on May 25, 2024

Fix is now released https://groups.google.com/g/prometheus-announce/c/D6RfOzypmr0 Thanks everyone! 💪🏽

from prometheus.

hacklschorsch commented on May 25, 2024

"Thanks for catching so early!"

The change made it into NixOS 23.11 and broke our staging system at the beginning of this week.
We're waiting for the fix to arrive downstream.

Can I say that I think this shouldn't happen in a release?

Can we add a test to make sure Prometheus is sending out valid HTTP requests?

Edit: Add something like https://pypi.org/project/httplint/ maybe?


frittentheke commented on May 25, 2024

It appears that the q-value of "2" might be the issue here. The HTTP RFC (https://www.rfc-editor.org/rfc/rfc9110.html#quality.values) states that the weight "is normalized to a real number in the range 0 through 1, where 0.001 is the least preferred and 1 is the most preferred".

Playing around with curl lets one quickly reproduce the issue:

Prometheus 2.49.0-like:
curl -v 'http://RABBITMQ:6666/metrics/detailed?family=queue_coarse_metrics&family=queue_consumer_count&family=channel_metrics' --header 'Accept: application/openmetrics-text;version=1.0.0;q=0.5,application/openmetrics-text;version=0.0.1;q=0.4,text/plain;version=0.0.4;q=0.3,*/*;q=2' -> 400 Bad Request

With q changed to something valid:
curl -v 'http://RABBITMQ:6666/metrics/detailed?family=queue_coarse_metrics&family=queue_consumer_count&family=channel_metrics' --header 'Accept: application/openmetrics-text;version=1.0.0;q=0.5,application/openmetrics-text;version=0.0.1;q=0.4,text/plain;version=0.0.4;q=0.3,*/*;q=0.1' -> 200 OK
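
For anyone who wants to check a header without a live server, here is a minimal Go sketch of the q-value rule from RFC 9110. The names `qvaluePattern` and `invalidQValues` are made up for illustration and are not taken from the Prometheus code base. The parsing is deliberately simplified: it splits on "," and ";" and assumes no quoted strings.

```go
package main

import (
	"regexp"
	"strings"
)

// qvaluePattern mirrors the RFC 9110 qvalue grammar:
//   ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
// Hypothetical helper, not part of Prometheus.
var qvaluePattern = regexp.MustCompile(`^(0(\.[0-9]{0,3})?|1(\.0{0,3})?)$`)

// invalidQValues returns every q parameter value in an Accept header
// value that does not match the RFC 9110 grammar.
// Simplification (assumption): media ranges are split on "," and
// parameters on ";", so quoted strings containing those characters
// are not handled.
func invalidQValues(accept string) []string {
	var bad []string
	for _, mediaRange := range strings.Split(accept, ",") {
		params := strings.Split(mediaRange, ";")
		// params[0] is the media type itself; the rest are parameters.
		for _, p := range params[1:] {
			name, value, found := strings.Cut(strings.TrimSpace(p), "=")
			if !found || !strings.EqualFold(name, "q") {
				continue
			}
			if !qvaluePattern.MatchString(value) {
				bad = append(bad, value)
			}
		}
	}
	return bad
}
```

Passing the Accept value from the first curl call (without the "Accept: " prefix) should flag only "2", and the second one should come back empty.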


ginkel commented on May 25, 2024

AFAICS there is a fix in main, which, however, does not seem to be part of the 2.49.0 release, cf. #13369


tsuna commented on May 25, 2024

We hit this too in another unrelated Java/Tomcat app, probably the same Java library under the hood that barks when the Accept header isn't perfectly well formatted. It would be nice to call out this issue in the release notes.


bwplotka commented on May 25, 2024

Nice catch. I somehow thought the fix was included in the latest release, but I checked wrongly. 2.49.1 is on the way...

Thanks for catching so early!


bwplotka commented on May 25, 2024

The 2.49.1 release with the fix for this issue is in progress and should be out in ~1h.


bwplotka commented on May 25, 2024

Ack. Broke meaning somebody ran 2.49.0 from NixOS 23.11 and reported the problem? Or just in theory broken?

Anyway, definitely more tests would be nice, thanks for reminding us! Help wanted (added issue)

The other aspect is that not enough systems/users try or upgrade to RC images. This issue was there for ~1 month (albeit during xmas season) since RC.0 (2023-12-12 to 2024-01-15), and even longer on main, yet no one noticed. It would be amazing to either remove the RC process or make use of it. 🤗


hacklschorsch commented on May 25, 2024

(Negative-plugging Nix // I just checked and NixOS-23.11 still doesn't have 2.49.1... IDK how the maintainer managed to hit that one hour window, and then missed the new release, but I want to point out that this happens and there's no such thing as "the bug was out there just one hour" and people who install Prometheus from the NixOS stable branch now, three weeks after you pushed out the fix, are going to hit this bug.) (Not wanting to seem ungrateful, thank you for your work!)

"Broke meaning somebody ran 2.49.0 from NixOS 23.11 and reported the problem? Or just in theory broken?"

Our staging environment can be broken, that's why we have it. Most of the distributed system is still running; only the monitoring of one of the main components isn't, and it is throwing us alerts instead ("Scraping down").
I could fix this by reverting the merge commit, but I think I'll just wait until 2.49.1 finds its way to us instead.

