
Comments (9)

sepich commented on June 3, 2024

Another way to do it is via the current --receive.relabel-config in the router stage, by exposing the tenant as an internal label (e.g. __meta_tenant_id) and allowing its modification:

- |
    --receive.relabel-config=
    - source_labels: [prometheus]
      target_label: __meta_tenant_id

This is the same way it is done in Mimir:
grafana/mimir#4725

Example of changes needed in Thanos:
44a0728#diff-42c21b7b04cc61ab0cda17794cc1efff14802e0e89a85503d28601e721c1dd31R849

from thanos.

MichaHoffmann commented on June 3, 2024

Thanks Filip - do you think the proposal for how the feature would work makes sense? I'm a little worried about adding too much overhead to the routing receivers and causing requests to get backed up.

Wouldn't this be evaluated on the ingesting receiver when looking up which tenant the sample should be written to?

I would imagine that instead of one local write, we would inspect the request, group it by tenant, and issue multiple local writes here:

h.sendLocalWrite(ctx, writeDestination, params.tenant, localWrites[writeDestination], responses)

Right? That would happen on the ingester, I think!
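To make the "group it by tenant" step concrete, here is a rough sketch in Go. It is not Thanos code: `Label`, `Series`, and `splitByTenant` are hypothetical stand-ins for the real remote-write types, and the `prometheus` label and `default-tenant` fallback are just the names used elsewhere in this thread.

```go
package main

import "fmt"

// Label and Series are simplified, hypothetical stand-ins for the
// remote-write types; they are not the Thanos or Prometheus definitions.
type Label struct{ Name, Value string }

type Series struct {
	Labels []Label
}

// splitByTenant groups series by the value of tenantLabel. Series that do
// not carry the label (or carry it empty) fall back to defaultTenant.
func splitByTenant(series []Series, tenantLabel, defaultTenant string) map[string][]Series {
	out := make(map[string][]Series)
	for _, s := range series {
		tenant := defaultTenant
		for _, l := range s.Labels {
			if l.Name == tenantLabel && l.Value != "" {
				tenant = l.Value
				break
			}
		}
		out[tenant] = append(out[tenant], s)
	}
	return out
}

func main() {
	batch := []Series{
		{Labels: []Label{{Name: "__name__", Value: "up"}, {Name: "prometheus", Value: "team-a"}}},
		{Labels: []Label{{Name: "__name__", Value: "up"}, {Name: "prometheus", Value: "team-b"}}},
		{Labels: []Label{{Name: "__name__", Value: "up"}}},
	}
	// Each group would then become its own local write for that tenant.
	for tenant, ss := range splitByTenant(batch, "prometheus", "default-tenant") {
		fmt.Printf("%s: %d series\n", tenant, len(ss))
	}
}
```

Each resulting group would correspond to one local write issued with that group's tenant, instead of a single write with params.tenant.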


fpetkovski commented on June 3, 2024

This would be a really cool feature indeed! We've tried to build a proxy that extracts the tenant from a label and sends one request per tenant with the appropriate header, but it overwhelmed receivers and was not worth the hassle. Having the feature natively built into Thanos is the way to go.


verejoel commented on June 3, 2024


fpetkovski commented on June 3, 2024

The ingesting receiver does not always have access to the hashring (e.g. in router-ingester split mode). So routers need to know which ingester to send samples to.


verejoel commented on June 3, 2024

Had some discussions with @MichaHoffmann. We came to the conclusion that the proposed implementation would completely break the current rate limiting concept. For example, if 1 tenant in a batch of 20 is over the limit, what should Thanos do? A 429 will be retried and result in the whole batch being ingested again. If we drop all metrics in the batch due to 1 tenant being over the limit, we have created a noisy neighbour problem. But if we accept the metrics from the valid 19/20 tenants, then we will have out-of-order issues.

So the current design is incompatible with per-tenant rate limiting as it stands.
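A small Go sketch of the dilemma, under assumed names: `partitionByLimit` is a hypothetical helper (not a Thanos API), and the sample counts and limit are made-up numbers. It only shows that a single request can contain both tenants within the limit and tenants over it, while the response carries one status code for the whole batch.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionByLimit splits tenants into those within and those over a
// per-request sample limit. Hypothetical helper for illustration only.
func partitionByLimit(samples map[string]int, limit int) (within, over []string) {
	for tenant, n := range samples {
		if n > limit {
			over = append(over, tenant)
		} else {
			within = append(within, tenant)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Strings(within)
	sort.Strings(over)
	return within, over
}

func main() {
	// Samples per tenant found in one remote-write batch (made-up numbers).
	samples := map[string]int{"team-a": 100, "team-b": 5000, "team-c": 250}
	within, over := partitionByLimit(samples, 1000)

	fmt.Println("within limit:", within)
	fmt.Println("over limit:", over)

	// One request gets one status code:
	//   429 for the batch    -> the sender retries everything, including
	//                           data that was already fine.
	//   drop the whole batch -> tenants within the limit lose data
	//                           (noisy neighbour).
	//   accept only "within" -> a later retry of the batch resends samples
	//                           that were already ingested.
}
```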


verejoel commented on June 3, 2024

@sepich I like that approach. Do you know how Mimir handles per-tenant limits in that situation?


MichaHoffmann commented on June 3, 2024

@verejoel I think it has the same issues with the rate limit, since all the samples still come from one remote write request!


GiedriusS commented on June 3, 2024

Implemented a PoC for this, works really well. A few caveats:

  • The tenant in the Receiver HTTP metrics is embedded into the http.Handler, so now everything falls under the default-tenant tenant 🤷
  • __meta_tenant_id is a bad idea because after relabel_configs all labels which begin with __meta are trimmed. You can add it in metric_relabel_configs, but for some reason it doesn't apply to meta-metrics like up.

