
docker-otel-lgtm's Introduction

docker-otel-lgtm

An OpenTelemetry backend in a Docker image.

Components included in the Docker image: OpenTelemetry collector, Prometheus, Tempo, Loki, Grafana

The grafana/otel-lgtm Docker image is an open source backend for OpenTelemetry that’s intended for development, demo, and testing environments. If you are looking for a production-ready, out-of-the-box solution to monitor applications and minimize MTTR (mean time to resolution) with OpenTelemetry and Prometheus, you should try Grafana Cloud Application Observability.

Documentation

Get the Docker image

The Docker image is available on Docker Hub: https://hub.docker.com/r/grafana/otel-lgtm
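
For example, to pull it locally:

# Pull the published image from Docker Hub
docker pull grafana/otel-lgtm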

Run the Docker image

# Unix/Linux
./run-lgtm.sh

# Windows (PowerShell)
./run-lgtm

Run LGTM in Kubernetes

# create k8s resources
kubectl apply -f k8s/lgtm.yaml

# port forwarding
kubectl port-forward service/lgtm 3000:3000 4317:4317 4318:4318

Send OpenTelemetry Data

There's no need to configure anything: The Docker image works with OpenTelemetry's defaults.

# Not needed as these are the defaults in OpenTelemetry:
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
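
As a quick smoke test of the OTLP/HTTP endpoint (assuming the defaults above, with the OTLP/HTTP receiver on port 4318), you can POST an empty payload and expect an HTTP 200 response:

# Optional smoke test: verifies that the OTLP/HTTP receiver responds
curl -X POST -H 'Content-Type: application/json' \
  -d '{"resourceSpans":[]}' \
  http://localhost:4318/v1/traces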

View Grafana

Log in to http://localhost:3000 with user admin and password admin.

Build the Docker image from scratch

cd docker/
docker build . -t grafana/otel-lgtm

Build and run the example app

Run the example REST service:

# Unix/Linux
./run-example.sh

# Windows (PowerShell)
./run-example

Generate traffic:

# Unix/Linux
./generate-traffic.sh

# Windows (PowerShell)
./generate-traffic

Run example apps in different languages

The example apps are in the examples/ directory. Each example has a run.sh or run.cmd script to start the app.

Every example implements a rolldice service, which returns a random number between 1 and 6.

Each example uses a different application port, so that all applications can run at the same time.

Example   Service URL
Java      curl http://localhost:8080/rolldice
Go        curl http://localhost:8081/rolldice
Python    curl http://localhost:8082/rolldice
dotnet    curl http://localhost:8083/rolldice

Related Work

docker-otel-lgtm's People

Contributors

arukiidou, devops-42, fstab, gian1200, grcevski, josegonzalez, macel94, martinyung, matt-hensley, nicolasbissig, pimmerks, yeroc, zeitlinger

docker-otel-lgtm's Issues

Mimir instead of Prometheus

It is the LGTM stack, so I think Mimir should be used, not Prometheus. Is there a reason why Prometheus was used?

Provide simple Kubernetes YAML

For Kubernetes deployments, it would be nice to provide a simple YAML manifest. Not hard to write one manually, but harder than copy+paste 🙂

The following is a good start:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: lgtm
spec:
  selector:
    matchLabels:
      app: lgtm
  template:
    metadata:
      labels:
        app: lgtm
    spec:
      containers:
      - name: lgtm
        image: grafana/otel-lgtm
---
apiVersion: v1
kind: Service
metadata:
  name: lgtm
spec:
  selector:
    app: lgtm
  ports:
  - name: ui
    port: 3000
  - name: grpc
    port: 4317
  - name: http
    port: 4318

Override Prometheus Configuration?

I am trying to run LGTM, but I want to add one extra scrape job so that it scrapes my Micrometer metrics from Quarkus. In my standalone Prometheus Docker Compose setup I was doing the following:

  prometheus:
    container_name: prometheus
    image: prom/prometheus
    volumes:
      - './monitoring/:/etc/prometheus/'
      - 'prometheus_data:/prometheus'
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - '9090:9090'
    networks:
      - back-tier

and this is the prometheus.yml I was using:

global:
 scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
 evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
 # scrape_timeout is set to the global default (10s).
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
 # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
 - job_name: 'quarkus-micrometer'
   metrics_path: '/q/metrics'
   scrape_interval: 15s
   static_configs:
     - targets: ['host.docker.internal:8080']

How can I add this to the LGTM stack so it starts scraping my http://localhost:8080/q/metrics endpoint like it does in the standalone Docker Compose setup? I didn't see an easy way, but maybe I am missing it?
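
One possible workaround (an untested sketch: the in-image config path below is an assumption and may differ between versions) is to bind-mount a custom Prometheus configuration over the one bundled in the image:

# Hypothetical: replace the bundled Prometheus config with a custom one
# (the /otel-lgtm/prometheus.yaml path is an assumption -- inspect the image to confirm)
docker run --name lgtm -p 3000:3000 -p 4317:4317 -p 4318:4318 \
  -v "$(pwd)/prometheus.yaml:/otel-lgtm/prometheus.yaml" \
  grafana/otel-lgtm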

Remove grafana version from directory name

First of all, thank you for creating this image. It is simplifying a workflow I use all the time to prototype or quickly spin up telemetry examples.

I was curious whether the version number being included in the Grafana directory name is a strict requirement. The reason I ask is that I will sometimes create a custom dashboard and want to mount it within the container automatically, but currently this requires me to know the exact version of Grafana running within the container:

version: '3'
services:
  collector:
    image: grafana/otel-lgtm:latest
    ports:
      - "4317:4317"
      - "3000:3000"
    volumes:
      - ./grafana/dashboards:/otel-lgtm/dashboards
      - ./grafana/dashboards.yaml:/otel-lgtm/grafana-v10.4.1/conf/provisioning/dashboards/dashboards.yaml

It would be much simpler to maintain if Grafana were installed into a generic grafana directory, so I could just do:

    volumes:
      - ./grafana/dashboards:/otel-lgtm/dashboards
      - ./grafana/dashboards.yaml:/otel-lgtm/grafana/conf/provisioning/dashboards/dashboards.yaml
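
Until then, one way to avoid hard-coding the version is to look up the directory name from the image itself (a sketch, assuming the layout shown above and that a shell is available in the image):

# Print the versioned Grafana directory shipped in the image
docker run --rm --entrypoint sh grafana/otel-lgtm -c 'ls -d /otel-lgtm/grafana-*'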

Multi-arch image builds (ARM64 and AMD64)

I can see that the Dockerfile in this repo has been updated to support multi-arch image builds in #29.

It would be great if the process you currently use to build the image would trigger a multi-arch build, since right now the published images are AMD64 only.

Any way we could help with that? I'm available to update the build process, should the suggestion of introducing a GitHub Actions pipeline (#66) be accepted, and to add multi-arch builds along with signatures and SLSA attestations.
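
For reference, a multi-arch build and push could look roughly like this with Docker Buildx (a sketch; the tag and the push step are placeholders):

# Sketch: build and push an amd64 + arm64 image with Buildx
docker buildx build --platform linux/amd64,linux/arm64 \
  -t grafana/otel-lgtm:latest --push docker/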

Using otlp with prometheus and loki

As a user, thank you for creating otel-lgtm.

If OTLP were used in all components, it would be easier to migrate from other backends to LGTM, and from LGTM to the individual components (Prometheus, Tempo, Loki).

Can I contribute to this?

Clean Data for Running Instance

Hi! Is there a prescribed way to wipe the data for an existing instance of otel-lgtm without restarting? The Docker container can be very expensive to start up.

Thank you!

http://otel-lgtm:3000/api/user/auth-tokens/rotate called in a loop

After logging in to otel-lgtm, it goes into an endless loop calling api/user/auth-tokens/rotate; this Docker container had been running for about 5 days. As you can see, the test is running in an incognito browser, so there shouldn't be any cache influencing it.

(Attached screencast: Screencast.from.2024-05-08.08-47-12.webm)

Document settings required to populate Dashboards.

I've been trying out this Docker image as someone new to both the Grafana products and OpenTelemetry in general. I believe I'm your target audience. That said, I'm struggling to get any of the three sample dashboards to populate with metrics using the OpenTelemetry Java agent with my own application. I can confirm instrumentation is working because I'm able to see some metrics in Explore, and traces and logs are populated as well, but all the metrics dashboards remain obstinately blank.

Here are my settings for Java Agent 2.2.0:

# Settings for the opentelemetry java agent

# in ms...
otel.bsp.schedule.delay=5000
otel.metric.export.interval=5000

otel.exporter.otlp.metrics.default.histogram.aggregation=base2_exponential_bucket_histogram

# capture enduser info...
otel.instrumentation.common.enduser.enabled=true
otel.instrumentation.common.enduser.id.enabled=true

otel.instrumentation.common.peer-service-mapping=foo-host:8082=foo-service

otel.semconv-stability.opt-in=http

otel.resource.attributes=service.version=HEAD-SNAPSHOT
otel.service.name=my-service

The above settings are trying to get the JVM and RED (native histograms) dashboards to populate.
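
For reference, the same settings can also be supplied as environment variables, using the standard OTel SDK mapping (upper-case, with dots and dashes replaced by underscores), e.g.:

# Environment-variable equivalents of some of the properties above
export OTEL_METRIC_EXPORT_INTERVAL=5000
export OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION=base2_exponential_bucket_histogram
export OTEL_SERVICE_NAME=my-service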

General feedback:

  • Having two RED Metrics dashboards is confusing. I strongly recommend a single RED dashboard using the metrics you consider the best / most future-proof path for someone new to OpenTelemetry. After a bunch of Googling, I think that might be the "native histogram" one, but I'm still not sure!
  • Please document the settings required to populate the dashboards. Yes, I did see the description at the top of the two RED Metrics dashboards, but it would be great to add the property settings (in addition to the environment variables), or simply a link to a more detailed page on this GitHub project. The JVM Metrics dashboard is missing any instructions, so I have no idea how to get it to populate (in Explore I can see that some jvm_* metrics are populated, so I'm not sure what's missing).
  • Consider enabling GitHub Discussions on this project (this issue would be more appropriate as a discussion post, but I can't figure out where else to post feedback!).

Support for ARM64

I'm trying out the Docker image on my MacBook (with an ARM CPU), and I'm seeing the following error:

grafana-lgtm The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

I'm not seeing any traces coming in to Tempo, but I'm not sure if that's related (I don't think so, since I've never had it working before :p).

Not sure if it's already on the roadmap.

As a workaround I've added:

  platform: linux/amd64

to my compose.yaml file.
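
The equivalent workaround for plain docker run (running the amd64 image under emulation, not a native build):

# Run the amd64 image under emulation on an ARM host
docker run --platform linux/amd64 --name lgtm \
  -p 3000:3000 -p 4317:4317 -p 4318:4318 --rm -ti grafana/otel-lgtm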

Documentation for shipping server logs/metrics via alloy

Once this is set up, there is an OTLP endpoint for collecting logs/metrics. That said, there isn't really an end-to-end guide for how to configure this with Alloy. Would it be possible for someone to write a short tutorial on how to do that?

Dockerfile doesn't build

I am trying to build the Docker image using the following commands:

cd docker/
docker build . -t grafana/otel-lgtm

The build fails with the following message

Step 12/32 : RUN curl -sOL https://dl.grafana.com/oss/release/grafana-$GRAFANA_VERSION.linux-${TARGETARCH}.tar.gz &&     tar xfz grafana-$GRAFANA_VERSION.linux-${TARGETARCH}.tar.gz &&     rm grafana-$GRAFANA_VERSION.linux-${TARGETARCH}.tar.gz
 ---> Running in a696aef13041

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

This is because of the TARGETARCH build argument that was introduced with #37. The variable is not set on my machine (Linux with Docker 26.0.2).

@zeitlinger @devops-42 do you have any idea what's wrong? If TARGETARCH is not supported by a default Docker installation we should not rely on it.
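
Possible workarounds until this is resolved (sketches, not changes to the repo): build with BuildKit, which populates the TARGETARCH platform build argument automatically, or set it explicitly on the legacy builder:

# Option 1: BuildKit fills in TARGETARCH automatically
DOCKER_BUILDKIT=1 docker build . -t grafana/otel-lgtm

# Option 2: set the build argument explicitly on the legacy builder
docker build --build-arg TARGETARCH=amd64 . -t grafana/otel-lgtm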

Not seeing logs in Python example.

I've been using this as a local LGTM version of our Cloud account as we migrate to OTel, and I'm just not seeing logs from the Python example. I couldn't see logs from my own application either, so I'm wondering whether the auto-instrumentation of Python logging is actually working, and whether you know of a way to debug this.
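
One thing worth checking (a sketch; the settings come from the opentelemetry-python docs, and app.py is a placeholder for the actual entrypoint) is that log export is explicitly enabled, since the Python logging auto-instrumentation is off by default:

# Enable the logging handler and the OTLP logs exporter for Python auto-instrumentation
export OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
export OTEL_LOGS_EXPORTER=otlp
opentelemetry-instrument python app.py  # app.py is a placeholder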

Add a way to configure retention

Prometheus retention is configured via a command-line flag rather than the config file, so there is currently no way to say "keep N weeks/months of data" without customizing the image. It would be great if we could somehow supply the retention setting (or other flags) to Prometheus.
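
For reference, these are the standard Prometheus flags the image would need to pass through (values here are just examples):

# Standard Prometheus retention flags (example values)
prometheus --config.file=prometheus.yml \
  --storage.tsdb.retention.time=30d \
  --storage.tsdb.retention.size=10GB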

Publish new otel-lgtm image with latest Grafana (11.1.0)

I tried building the Docker image locally; it builds successfully, but fails to run with the error "/bin/sh: /otel-lgtm/run-all.sh: /bin/bash^M: bad interpreter: No such file or directory".

cd docker
docker build . -t grafana/otel-lgtm
docker run --name lgtm -p 3000:3000 -p 4317:4317 -p 4318:4318 --rm -ti grafana/otel-lgtm:latest

Error during docker run

Tested on
Windows 11 with Docker Desktop version 4.31.1

docker info

Server:
 Containers: 52
  Running: 51
  Paused: 0
  Stopped: 1
 Images: 63
 Server Version: 26.1.4
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d2d58213f83a351ca8f528a95fbd145f5654e957
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
 Kernel Version: 5.15.153.1-microsoft-standard-WSL2
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 7.609GiB
 Name: docker-desktop
 ID: 5c78005b-8f21-4360-b074-0ce826847485
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Labels:
  com.docker.desktop.address=npipe://\\.\pipe\docker_cli
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false
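
The /bin/bash^M in the error indicates that the shell scripts were checked out with Windows (CRLF) line endings and baked into the image that way. A possible fix (a sketch) is to re-checkout the files with LF endings and rebuild:

# Re-checkout the repo with Unix (LF) line endings, then rebuild the image
git config core.autocrlf input
git rm -r --cached .
git reset --hard
cd docker && docker build . -t grafana/otel-lgtm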

Logs linked to span broke in 0.6.0

After upgrading to 0.6.0, my demo broke when trying to retrieve logs for a given span by clicking the "Logs for this span" button. It seems like the job label filter in Loki is no longer available.

Build releases in CI

I was looking to see if this supported ARM64 natively, and it looks like the tags are pushed by a user manually as shown in the output here. Ideally this would be done in CI so I could more easily add arm64 support.

Is this something y'all would take a PR for? It'll probably need permissions from a member of the Grafana org on Docker Hub to set a secret on this repo to push.
