
wg-prometheus's Introduction

OpenTelemetry Prometheus Working Group

This repository is used to track the progress of the Prometheus working group in addressing compatibility gaps between OpenTelemetry and Prometheus and improving OpenTelemetry's Prometheus support.

The progress of the group can be tracked through the issues in this repository.

wg-prometheus's People

Contributors

alolita, andreimatei, rakyll, sergeykanzhelev

wg-prometheus's Issues

Clarify the meaning and purpose of external labels

External labels were discussed in the 4/14 Prometheus-OTel-WG SIG meeting.

The Prometheus documentation describes external labels:

  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

The documentation says only that the labels are "added"; no semantic interpretation is given. In practice:

  • some external labels describe the process being monitored (e.g., datacenter name)
  • some external labels describe how the process was monitored (e.g., replica name)
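As a concrete, hypothetical illustration, a global Prometheus configuration might set something like the following, where the label names are invented for the example:

  external_labels:
    datacenter: us-east-1          # describes the monitored process (descriptive)
    prometheus_replica: replica-2  # describes how it was monitored (non-identifying)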

It seems we have a mix of descriptive and non-identifying attributes. OpenTelemetry has not formally added a mechanism to distinguish different kinds of attribute, but it appears increasingly important that we do this. In today's Prometheus/Cortex environment, the backend system has to be configured to recognize duplicate streams of information. I would like for OTLP to include a formal way to encode duplicate streams of information, which means distinguishing identifying and descriptive attributes from those that are non-identifying.

The terminology used here is developed in open-telemetry/opentelemetry-specification#1298, where it seems we have three kinds of attribute: identifying (e.g., "job", "instance"), descriptive (e.g., data center, k8s node), and non-identifying (e.g., replica name).

One promising way to expose this information in the OTLP protocol is through schemas; see open-telemetry/oteps#152.

Add scrape target update endpoint to the Collector's Prometheus receiver

In order to support distribution of Prometheus scrape targets among a set of Collector instances, as required for #6, we will need a mechanism for updating the set of scrape targets used by any given Collector instance. The Prometheus receiver should be extended with a server that can process requests to update the scrape targets for a given job: it should accept a list of Prometheus static_config entries, which are written to a file the receiver watches via a file_sd_config. This service should only be active when enabled by a new configuration option for the Prometheus receiver that is disabled by default. The configuration should also allow the user to specify the port the service listens on, along with any other service configuration items that may be appropriate.
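As a rough sketch of the proposal (the file path is hypothetical and the update endpoint does not exist yet), the receiver would watch a file via file_sd_config, and the new endpoint would rewrite that file from the static_config entries it receives:

  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: 'example-job'
            file_sd_configs:
              - files: ['/var/lib/otelcol/targets/example-job.yaml']

  # Contents of /var/lib/otelcol/targets/example-job.yaml, rewritten by the
  # proposed update endpoint from the static_config entries it receives:
  - targets: ['10.0.0.5:9100', '10.0.0.6:9100']
    labels:
      team: 'example'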

This issue should be used to track the design of this service and to aggregate any other issues or PRs created during implementation.

Decide on performance benchmarking criteria

The work we planned for Phase 1 will be mainly about stability and performance. In order to achieve our performance goals, we will need to track improvements and regressions. We are considering benchmarking the entire Phase 1 pipeline (Prometheus receiver -> Collector -> Prometheus remote write exporter) and will potentially contribute micro-benchmarks as needed. We need to decide what to benchmark, which platforms to run the benchmarks on, and which dimensions to measure.

Prior work

Previously, we ran manual benchmarks on Kubernetes (EKS), on a cluster with 10 m5.8xlarge nodes. On Kubernetes, collection scales with how many jobs are running in the entire cluster and how many metrics are generated per job. The total number of jobs running in a cluster is capped by the resources available to the cluster. We used a simple app that exposes a lightweight HTTP server publishing a given number of metrics. The metrics are collected by the OTel Prometheus receiver and exported to Amazon Managed Service for Prometheus (AMP).

We published 40, 160, 400, and 1000 metrics from each server, ran 25, 50, 100, 250, and 500 replicas of the server, and measured resource usage, export rate (samples per second), and dropped vs. exported metric samples. The scraper was configured to scrape every 15 seconds, which is more aggressive than what most users will use. Scraping frequency only became a bottleneck when 1000 metrics were exported from 50+ replicas.
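For reference, the scrape settings were of roughly this shape (the job name and discovery details are illustrative, not the exact configuration used in the benchmark):

  scrape_configs:
    - job_name: 'benchmark-app'
      scrape_interval: 15s
      kubernetes_sd_configs:
        - role: pod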

This work mainly targeted Kubernetes; results might differ on platforms that use a different Prometheus service discovery mechanism.

Prometheus Service Discovery Configuration Interception

To enable the OTel Operator to perform Prometheus scrape target identification for a set of Collector instances in support of #6, we need to be able to identify and extract all *_sd_config and relabel_config entries from each scrape_config entry in the Prometheus receiver configuration. All *_sd_config entries should be replaced by a single file_sd_config entry referencing a file that can be updated by the Collector Prometheus receiver's target update mechanism (to be constructed), before the configuration is used to create a ConfigMap for Collector instances. The extracted configurations should be preserved for use by the target discovery and distribution mechanism to be built in the Operator.
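For example (job name, labels, and file path are hypothetical), a user-supplied scrape_config like this:

  scrape_configs:
    - job_name: 'kube-pods'
      kubernetes_sd_configs:
        - role: pod
      relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app]
          action: keep
          regex: my-app

would be rewritten, before being placed in the Collector ConfigMap, to something like:

  scrape_configs:
    - job_name: 'kube-pods'
      file_sd_configs:
        - files: ['/var/lib/otelcol/targets/kube-pods.yaml']

with the extracted kubernetes_sd_configs and relabel_configs retained by the Operator for target discovery and distribution.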

This issue should be used to track the design of the SD configuration interception mechanism in the OTel Operator and to aggregate any other issues or PRs created during implementation.

Support retry mechanisms similar to Prometheus server, allow fine tuning

Add StatefulSet support to OTel Operator

As outlined in #14, we want to be able to deploy the Collector in a horizontally scaled configuration as a StatefulSet. Combined with other work deriving from #6, this will enable efficient scaling to a large number of scrape targets.
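A sketch of what the custom resource might look like once this mode exists; the statefulset value is the proposal here, and the other fields are assumed to follow the existing OpenTelemetryCollector CRD shape:

  apiVersion: opentelemetry.io/v1alpha1
  kind: OpenTelemetryCollector
  metadata:
    name: prom-scrapers
  spec:
    mode: statefulset   # the new mode proposed by this issue
    replicas: 5
    config: |
      # Collector pipeline configuration (Prometheus receiver, exporters, ...)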

This issue should be used to track the design of StatefulSet management in the OTel Operator and to aggregate any other issues and PRs created during implementation.

Support write-ahead log (WAL) capabilities similar to Prometheus server

Support WAL capabilities similar to the Prometheus server's. The Grafana Cloud Agent provides a reference implementation; see https://github.com/grafana/agent

OTel Component: https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/prometheusreceiver

Also see Brian Brazil's post: https://www.robustperception.io/how-much-space-does-the-wal-take-up

@dashpole suggested that since WAL capabilities are generally useful and not only useful to a Prometheus server, it may be worth implementing them as a separate processor rather than inside the Prometheus receiver.
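Purely as an illustration of the configuration shape this could take (the wal block and its option names are hypothetical, and whether it belongs in the receiver, a dedicated processor, or the remote write exporter is exactly what this issue needs to decide):

  exporters:
    prometheusremotewrite:
      endpoint: https://example.com/api/v1/write
      # hypothetical WAL settings; names are illustrative only
      wal:
        directory: /var/lib/otelcol/wal
        truncate_frequency: 1m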

[Tracker Issue] Design a Prometheus-specific CRD for the Operator

We may build a Prometheus-specific Operator to manage Prometheus autoscaling, sharding, and other high-level deployment configuration. Design a CRD document covering deployment- and scheduling-specific configuration, and discuss it with the OpenTelemetry Operator project to identify any potential breaking changes.

Support Prometheus histograms

Prometheus uses le (less-than-or-equal) bucket bounds, while OTel uses ge (greater-than-or-equal) bucket bounds. The two are mathematically incompatible, and it is impossible to losslessly transform one into the other.
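A small worked illustration of why (observations and boundary chosen arbitrarily): take the observations 2, 5, and 9 with a single boundary at 5.

  le-bounded (Prometheus, cumulative):  count{le="5"} = 2,  count{le="+Inf"} = 3
  ge-bounded (hypothetical notation):   count{ge="5"} = 2,  count{ge="-Inf"} = 3

From the le counts alone we know that two observations are <= 5 and one is > 5, but not how many are exactly 5, so count{ge="5"} cannot be recovered from the le buckets (and vice versa).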

Which Prometheus "internal" metrics are required for conformance?

Currently, conformance tests require that the 'up' metric be present. However, there are other internal metrics, such as scrape_duration_seconds, scrape_samples_post_metric_relabeling, scrape_samples_scraped, and scrape_series_added, that a Prometheus server would also produce and that are designed to make debugging Prometheus endpoints easier.
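For reference, these are the series a Prometheus server synthesizes for each scrape target (the label values and sample values below are illustrative):

  up{job="app", instance="10.0.0.5:9100"} 1
  scrape_duration_seconds{job="app", instance="10.0.0.5:9100"} 0.042
  scrape_samples_scraped{job="app", instance="10.0.0.5:9100"} 137
  scrape_samples_post_metric_relabeling{job="app", instance="10.0.0.5:9100"} 137
  scrape_series_added{job="app", instance="10.0.0.5:9100"} 0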

I asked this question at the WG meeting today, and we weren't sure which of these would be required for conformance. There is a separate question of which are useful, but we should probably start with the ones that are required.

@RichiH agreed to raise this question with the Prometheus folks.

Prometheus Histogram edge case which we don't support

From the documentation, sum is not always present and MUST NOT be present in some scenarios; see:

Negative threshold buckets MAY be used, but then the Histogram MetricPoint MUST NOT contain a sum value as it would no longer be a counter semantically.

This is a very strange edge case to make things work with Prometheus, and I don't know how we can support this and enforce it.
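A hypothetical OpenMetrics exposition illustrating the case (metric name invented): a histogram with negative-threshold buckets exposes _bucket and _count samples but no _sum sample.

  # TYPE temperature_delta histogram
  temperature_delta_bucket{le="-5.0"} 2
  temperature_delta_bucket{le="0.0"} 5
  temperature_delta_bucket{le="5.0"} 9
  temperature_delta_bucket{le="+Inf"} 10
  temperature_delta_count 10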

One option is to have open-telemetry/opentelemetry-proto#187 in the data model. This will not work that well, because receiving an OpenMetrics point without a SUM may imply that negative buckets are "possibly" present, but that is not guaranteed, and the next point may contain a SUM? <- @RichiH is this possible?

Remote write compliance: TestRemoteWrite/otelcollector/InstanceLabel

The OpenTelemetry Collector is not passing the TestRemoteWrite/otelcollector/InstanceLabel test from https://github.com/prometheus/compliance/tree/main/remote_write.

=== CONT  TestRemoteWrite/otelcollector/InstanceLabel
    helpers.go:21:
        	Error Trace:	helpers.go:21
        	            				helpers.go:52
        	            				helpers.go:13
        	            				instance_label.go:26
        	            				main_test.go:101
        	            				main_test.go:65
        	Error:      	Should be true
        	Test:       	TestRemoteWrite/otelcollector/InstanceLabel
        	Messages:   	label 'instance' not found

PrometheusReceiver Ignores Timeseries (Histogram and Summary) Metrics without "_sum" counter

Currently, when time series data is scraped by the Prometheus scraper, it expects bucketed data and two counters along with those buckets: _count and _sum.
Some frameworks do not capture the sum and hence do not produce the _sum counter in the Prometheus exposition format. While the scraper and Prometheus itself work fine and display appropriate graphs without this counter, the Prometheus receiver expects it to be present for all such time series metrics and, if it is not, silently ignores the metric and its associated data points.
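As a hypothetical illustration (metric name invented), an exposition like the following, which has buckets and a _count but no _sum, is currently dropped by the receiver:

  # TYPE request_latency_seconds histogram
  request_latency_seconds_bucket{le="0.1"} 4
  request_latency_seconds_bucket{le="1"} 9
  request_latency_seconds_bucket{le="+Inf"} 10
  request_latency_seconds_count 10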

Is this really the desired behavior for PrometheusReceiver?

Clarify how Prometheus uses the OpenMetrics "Created" timestamp

The OpenMetrics specification states for Counter metrics:

A MetricPoint in a Metric with the type Counter SHOULD have a Timestamp value called Created. This can help ingestors discern between new metrics and long-running ones it did not see before.

A MetricPoint in a Metric's Counter's Total MAY reset to 0. If present, the corresponding Created time MUST also be set to the timestamp of the reset.
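For reference, a counter exposing the Created timestamp in the OpenMetrics text format looks roughly like this (metric name, label, and values are illustrative):

  # TYPE http_requests counter
  http_requests_total{code="200"} 1027
  http_requests_created{code="200"} 1623772800.0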

The OpenTelemetry data model agrees that this field is useful, and that it should be optional. We have argued that when the Created / Start time is not set, it is possible to miss process restarts, and thus undercount metrics for short-lived processes.

We are trying to define the proper translation into OTLP for metric points when the Created time is not known. This is relevant in https://github.com/lightstep/opentelemetry-prometheus-sidecar, which reads the WAL and writes OTLP metric streams. We believe that a Created / Start time can be filled in by any stateful observer that is able to remember the last value and its timestamp.

When a stateful observer possesses this information, we believe that the processor SHOULD fill in the missing start timestamp.

The issue here is investigatory. Does Prometheus have plans to use the OpenMetrics Created timestamp and eventually include that in its WAL?
