Comments (9)
Another way to do it is via the current --receive.relabel-config in the router stage, by exposing the tenant as an internal label (e.g. __meta_tenant_id) and allowing its modification:
- |
  --receive.relabel-config=
  - source_labels: [prometheus]
    target_label: __meta_tenant_id
This is the same way it is done in Mimir:
grafana/mimir#4725
Example of changes needed in Thanos:
44a0728#diff-42c21b7b04cc61ab0cda17794cc1efff14802e0e89a85503d28601e721c1dd31R849
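To make the relabel rule above concrete, here is a minimal Go sketch of what a "replace" relabel step does to a label set. It assumes the Prometheus defaults for a rule that only sets source_labels and target_label (action replace, regex (.*), replacement $1, separator ";"). The function name replaceRule and the map-based label set are hypothetical and are not Thanos or Prometheus code.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceRule is a simplified, hypothetical model of the default "replace"
// relabel action: the source label values are joined with ";" and the result
// is written to the target label. An empty result removes the target label.
func replaceRule(labels map[string]string, sourceLabels []string, target string) map[string]string {
	vals := make([]string, 0, len(sourceLabels))
	for _, name := range sourceLabels {
		vals = append(vals, labels[name])
	}

	// Copy the input so the caller's label set is left untouched.
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}

	if joined := strings.Join(vals, ";"); joined != "" {
		out[target] = joined
	} else {
		delete(out, target)
	}
	return out
}

func main() {
	in := map[string]string{"__name__": "up", "prometheus": "team-a"}
	out := replaceRule(in, []string{"prometheus"}, "__meta_tenant_id")
	fmt.Println("tenant label:", out["__meta_tenant_id"])
}
```

With this rule, a series carrying prometheus="team-a" would end up with __meta_tenant_id="team-a", which the router could then read as the tenant.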
from thanos.
Thanks Filip - do you think the proposal for how the feature would work makes sense? I'm a little worried about adding too much overhead to the routing receivers and causing requests to get backed up.
Wouldn't this be evaluated on the ingesting receiver when looking up which tenant the sample should be written to?
I would imagine that instead of one local write, we would inspect the request, group it by tenant, and issue multiple local writes here (line 793 in 4a73fc3).
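The grouping step described above could look roughly like the following Go sketch. The Label and Series types are simplified stand-ins invented for illustration (not the real remote-write or Thanos types), and the tenant label name "prometheus" is borrowed from the relabel example earlier in the thread.

```go
package main

import (
	"fmt"
	"sort"
)

// Label and Series are hypothetical, simplified stand-ins for the
// remote-write types; they are not the actual Thanos structures.
type Label struct {
	Name, Value string
}

type Series struct {
	Labels []Label
	Value  float64
}

// splitByTenant groups series by the value of tenantLabel. Series without
// that label fall back to defaultTenant.
func splitByTenant(series []Series, tenantLabel, defaultTenant string) map[string][]Series {
	out := make(map[string][]Series)
	for _, s := range series {
		tenant := defaultTenant
		for _, l := range s.Labels {
			if l.Name == tenantLabel {
				tenant = l.Value
				break
			}
		}
		out[tenant] = append(out[tenant], s)
	}
	return out
}

func main() {
	batch := []Series{
		{Labels: []Label{{"__name__", "up"}, {"prometheus", "team-a"}}, Value: 1},
		{Labels: []Label{{"__name__", "up"}, {"prometheus", "team-b"}}, Value: 1},
		{Labels: []Label{{"__name__", "up"}}, Value: 0},
	}
	groups := splitByTenant(batch, "prometheus", "default-tenant")

	// Sort tenant names so the output order is stable.
	tenants := make([]string, 0, len(groups))
	for t := range groups {
		tenants = append(tenants, t)
	}
	sort.Strings(tenants)
	for _, t := range tenants {
		// One local write per tenant would be issued at this point.
		fmt.Printf("tenant=%s series=%d\n", t, len(groups[t]))
	}
}
```

The cost is one pass over the series in the request plus one write per distinct tenant, which is the overhead the earlier comment is worried about.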
This would be a really cool feature indeed! We've tried to build a proxy that extracts the tenant from a label and sends one request per tenant with the appropriate header, but it overwhelmed receivers and was not worth the hassle. Having the feature natively built into Thanos is the way to go.
The ingesting receiver does not always have access to the hashring (e.g. in router-ingester split mode). So routers need to know which ingester to send samples to.
Had some discussions with @MichaHoffmann. We came to the conclusion that the proposed implementation would completely break the current rate limiting concept. For example, if 1 tenant in a batch of 20 is over the limit, what should Thanos do? A 429 will be retried and result in the whole batch being ingested again. If we drop all metrics in the batch because 1 tenant is over the limit, we have created a noisy neighbour problem. But if we accept the metrics from the valid 19/20 tenants, then we will have out-of-order issues.
So the current design is incompatible with per-tenant rate limiting as it stands.
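The dilemma can be illustrated with a small Go sketch. It assumes a very simple limit model (a maximum number of samples per tenant per request), which is an assumption made for illustration and not how Thanos limits are actually configured. The function name admit is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// admit splits the tenants of one batch into accepted and rejected sets,
// given the number of samples each tenant contributes and a per-tenant limit.
// The limit model (samples per request) is an assumption for illustration.
func admit(samples map[string]int, limit int) (accepted, rejected []string) {
	for tenant, n := range samples {
		if n > limit {
			rejected = append(rejected, tenant)
		} else {
			accepted = append(accepted, tenant)
		}
	}
	sort.Strings(accepted)
	sort.Strings(rejected)
	return accepted, rejected
}

func main() {
	batch := map[string]int{"team-a": 50, "team-b": 500, "team-c": 20}
	accepted, rejected := admit(batch, 100)
	fmt.Println("within limit:", accepted)
	fmt.Println("over limit:  ", rejected)
	// The request carries a single HTTP status, so one rejected tenant
	// forces a choice for the whole batch: return 429 (everything is
	// retried), drop everything, or accept only the tenants within limit.
}
```

Whichever branch is chosen for the rejected set, the sender only sees one response for the whole remote write request, which is why the per-tenant decision cannot be communicated back cleanly.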
@sepich I like that approach. Do you know how Mimir handles per-tenant limits in that situation?
@verejoel I think it has the same issues with the rate limit, since all the samples still come from one remote write request!
Implemented a PoC for this, works really well. A few caveats:
- tenant in Receiver HTTP metrics is embedded into the http.Handler, so now what I see is that everything falls under the default-tenant tenant 🤷
- __meta_tenant_id is a bad idea because after relabel_configs all labels which begin with __meta are trimmed. You can add it in metric_relabel_configs, but for some reason it doesn't apply to meta-metrics like up