kafka-lag-monitor

Monitors Kafka consumer lag and publishes the metrics to one or more metrics backends.

Metrics

The supported metrics backends are Prometheus and InfluxDB.

Sample metrics

Prometheus:

The metrics in Prometheus format can be accessed at the /prometheus endpoint.

# HELP kafka_consumer_lag_max  
# TYPE kafka_consumer_lag_max gauge
kafka_consumer_lag_max{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 2.0
kafka_consumer_lag_max{cluster_name="test-cluster",group="test-consumer",partition="0",topic="test-topic",} 2.0
# HELP kafka_consumer_lag  
# TYPE kafka_consumer_lag summary
kafka_consumer_lag_count{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 1.0
kafka_consumer_lag_sum{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 2.0
kafka_consumer_lag_count{cluster_name="test-cluster",group="test-consumer",partition="0",topic="test-topic",} 1.0
kafka_consumer_lag_sum{cluster_name="test-cluster",group="test-consumer",partition="0",topic="test-topic",} 2.0
# HELP kafka_consumer_offset  
# TYPE kafka_consumer_offset summary
kafka_consumer_offset_count{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 1.0
kafka_consumer_offset_sum{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 16.0
kafka_consumer_offset_count{cluster_name="test-cluster",group="test-consumer",partition="0",topic="test-topic",} 1.0
kafka_consumer_offset_sum{cluster_name="test-cluster",group="test-consumer",partition="0",topic="test-topic",} 13.0
# HELP kafka_consumer_offset_max  
# TYPE kafka_consumer_offset_max gauge
kafka_consumer_offset_max{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 16.0
kafka_consumer_offset_max{cluster_name="test-cluster",group="test-consumer",partition="0",topic="test-topic",} 13.0
# HELP kafka_partition_offset  
# TYPE kafka_partition_offset summary
kafka_partition_offset_count{cluster_name="test-cluster",partition="1",topic="test-topic",} 1.0
kafka_partition_offset_sum{cluster_name="test-cluster",partition="1",topic="test-topic",} 18.0
kafka_partition_offset_count{cluster_name="test-cluster",partition="0",topic="test-topic",} 1.0
kafka_partition_offset_sum{cluster_name="test-cluster",partition="0",topic="test-topic",} 15.0
# HELP kafka_partition_offset_max  
# TYPE kafka_partition_offset_max gauge
kafka_partition_offset_max{cluster_name="test-cluster",partition="1",topic="test-topic",} 18.0
kafka_partition_offset_max{cluster_name="test-cluster",partition="0",topic="test-topic",} 15.0
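
The sample values above are consistent with lag being the gap between a partition's latest offset and the consumer group's offset (18 - 16 = 2 for partition 1, 15 - 13 = 2 for partition 0). The following is a minimal Python sketch, assuming the text format shown above, that parses a few of those lines and checks that relationship. The helper name parse_samples is hypothetical and not part of this project.

```python
import re

# A few lines copied from the sample output above
SAMPLE = '''\
# TYPE kafka_consumer_lag_max gauge
kafka_consumer_lag_max{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 2.0
kafka_consumer_offset_max{cluster_name="test-cluster",group="test-consumer",partition="1",topic="test-topic",} 16.0
kafka_partition_offset_max{cluster_name="test-cluster",partition="1",topic="test-topic",} 18.0
'''

# metric_name{label="value",...} value
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
# Matching label="value" pairs one by one tolerates the trailing comma in the sample
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match.group('labels')))
            samples.append((match.group('name'), labels, float(match.group('value'))))
    return samples


# Only one partition is present in SAMPLE, so keying by metric name is enough here
values = {name: value for name, labels, value in parse_samples(SAMPLE)}
lag = values['kafka_partition_offset_max'] - values['kafka_consumer_offset_max']
print(lag)
```

For anything beyond a quick check, a Prometheus server or a proper exposition-format parser is a better fit than hand-written regular expressions.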

InfluxDB:

By default, metrics are reported in InfluxDB's line protocol format to the http://localhost:8086/write endpoint every minute.

kafka_consumer_lag,cluster_name=test-cluster,group=test-consumer,partition=0,topic=test-topic,metric_type=histogram sum=2,count=1,mean=2,upper=2 1612125711313
kafka_consumer_lag,cluster_name=test-cluster,group=test-consumer,partition=1,topic=test-topic,metric_type=histogram sum=2,count=1,mean=2,upper=2 1612125711311
kafka_consumer_offset,cluster_name=test-cluster,group=test-consumer,partition=0,topic=test-topic,metric_type=histogram sum=13,count=1,mean=13,upper=13 1612125711307
kafka_consumer_offset,cluster_name=test-cluster,group=test-consumer,partition=1,topic=test-topic,metric_type=histogram sum=16,count=1,mean=16,upper=16 1612125711308
kafka_partition_offset,cluster_name=test-cluster,partition=0,topic=test-topic,metric_type=histogram sum=15,count=1,mean=15,upper=15 1612125711311
kafka_partition_offset,cluster_name=test-cluster,partition=1,topic=test-topic,metric_type=histogram sum=18,count=1,mean=18,upper=18 1612125711313
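
Each record above follows the line protocol layout of measurement and tags, then fields, then a timestamp, separated by spaces. As a rough illustration, here is a Python sketch that splits one of the sample lines into those parts. The function name parse_line is hypothetical, and the sketch assumes, as in the sample, that there are no escaped characters or quoted string fields.

```python
def parse_line(line):
    """Split one line-protocol record into measurement, tags, fields and timestamp.

    Simplification: assumes no escaped spaces or commas and only numeric fields.
    """
    head, field_part, timestamp = line.split(' ')
    measurement, *tag_pairs = head.split(',')
    tags = dict(pair.split('=', 1) for pair in tag_pairs)
    fields = {
        key: float(value)
        for key, value in (pair.split('=', 1) for pair in field_part.split(','))
    }
    return measurement, tags, fields, int(timestamp)


# First line of the sample output above
record = (
    'kafka_consumer_lag,cluster_name=test-cluster,group=test-consumer,'
    'partition=0,topic=test-topic,metric_type=histogram '
    'sum=2,count=1,mean=2,upper=2 1612125711313'
)
measurement, tags, fields, timestamp = parse_line(record)
print(measurement, tags['partition'], fields['upper'])
```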

Usage

docker run --rm \
        -p 8080:8080  \
        -v /path/to/config:/config \
        -e MICRONAUT_CONFIG_FILES=/config/application.yml \
        -e MICRONAUT_METRICS_EXPORT_INFLUX_ENABLED=false \
        devatherock/kafka-lag-monitor:latest

Configurable properties

application.yml variables

kafka:
  clusters: # Required. A list of kafka cluster definitions
    - name: test-cluster # Required. Name of the cluster. The same name will be needed in `kafka.lag-monitor.clusters[*].name` config. 
      servers: test-cluster.test.com:9092 # Required. The server(s)/broker(s) that belong to this cluster
  lag-monitor:
    clusters:
      - name: test-cluster # Required. Name of the cluster to monitor. Should be one of the defined `kafka.clusters[*].name`
        consumer-groups: # Optional. List of consumer group names to monitor. Names will be matched exactly. Use `group-allowlist` for regex match
          - test-consumer
        group-allowlist: # Optional. List of regular expressions to match against consumer group names to monitor. Will be ignored if `consumer-groups` is specified
          - deva.*
        group-denylist: # Optional. List of regular expressions to match against consumer group names to exclude. Will be ignored if `consumer-groups` or `group-allowlist` is specified
          - temp.*
    threadpool-size: 5 # Optional. Size of the thread pool used by the lag monitor. Defaults to 5
    timeout-seconds: 5 # Optional. Timeout for the requests to Kafka, in seconds. Defaults to 5
    initial-delay-seconds: 60 # Optional. Initial delay before metric collection begins, in seconds. Defaults to 60
    interval-seconds: 60 # Optional. Metric collection interval, in seconds. Defaults to 60
micronaut:
  server:
    port: 8080 # Optional. Port on which the app listens
  metrics:
    export:
      influx: # Config for publishing metrics to Influxdb
        enabled: false # Optional. Indicates if metrics reporting to Influxdb is enabled. Defaults to true
        uri: https://some.influx.host # Optional. The HTTP endpoint exposed by Influxdb, to which to report metrics. Defaults to http://localhost:8086
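
The comments on `consumer-groups`, `group-allowlist` and `group-denylist` describe an order of precedence: exact names win over the allowlist, and the allowlist wins over the denylist. One way to read that rule is sketched below in Python. This is an illustration only, not the project's implementation: the function select_groups is hypothetical, full-string regex matching is an assumption, and so is the behavior of monitoring every group when none of the three options is set.

```python
import re


def select_groups(all_groups, consumer_groups=None, allowlist=None, denylist=None):
    """Pick the consumer groups to monitor, following the documented precedence."""
    if consumer_groups:
        # Exact names take priority; the allowlist and denylist are ignored
        return [g for g in all_groups if g in consumer_groups]
    if allowlist:
        # Regex allowlist; the denylist is ignored (full-match semantics assumed)
        return [g for g in all_groups if any(re.fullmatch(p, g) for p in allowlist)]
    if denylist:
        # Only the denylist is set: keep everything that does not match it
        return [g for g in all_groups if not any(re.fullmatch(p, g) for p in denylist)]
    # Assumption: with no filters configured, all groups are monitored
    return list(all_groups)


groups = ['test-consumer', 'deva-app', 'temp-job', 'other']
print(select_groups(groups, allowlist=['deva.*'], denylist=['temp.*']))
```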

Environment variables

| Environment Variable Name | Required | Default | Description |
|---|---|---|---|
| KAFKA_LAG_MONITOR_THREADPOOL_SIZE | false | 5 | Size of the thread pool used by the lag monitor |
| KAFKA_LAG_MONITOR_TIMEOUT_SECONDS | false | 5 | Timeout for the requests to Kafka, in seconds |
| LOGGER_LEVELS_ROOT | false | INFO | SLF4J log level, for all (framework and custom) code |
| LOGGER_LEVELS_IO_GITHUB_DEVATHEROCK | false | INFO | SLF4J log level, for custom code |
| MICRONAUT_SERVER_PORT | false | 8080 | Port on which the app listens |
| MICRONAUT_CONFIG_FILES | true | (None) | Path to YAML config files. The YAML files can be used to specify complex, object and array properties |
| MICRONAUT_METRICS_EXPORT_INFLUX_ENABLED | false | true | Indicates if metrics reporting to InfluxDB is enabled |
| MICRONAUT_METRICS_EXPORT_INFLUX_URI | false | http://localhost:8086 | The HTTP endpoint exposed by InfluxDB, to which to report metrics |
| LOGBACK_CONFIGURATION_FILE | false | (None) | Path to logback configuration file |

Troubleshooting

Enabling debug logs

  • Set the environment variable LOGGER_LEVELS_ROOT to DEBUG to enable all debug logs - custom and framework
  • Set the environment variable LOGGER_LEVELS_IO_GITHUB_DEVATHEROCK to DEBUG to enable debug logs only in custom code
  • For fine-grained logging control, supply a custom logback.xml file and set the environment variable LOGBACK_CONFIGURATION_FILE to /path/to/custom/logback.xml

JSON logs

To output logs as JSON, set the environment variable LOGBACK_CONFIGURATION_FILE to logback-json.xml. Refer to the logstash-logback-encoder documentation to customize the field names and formats in the log.

kafka-lag-monitor's Issues

API to expose consumer group lag

The URL can be /kafka/lag?cluster={cluster_name}&group={consumer_group}, with an optional topic parameter. The response can be in the below format:

{
    "total_lag": 3,
    "lag_by_partition": [{
        "partition": 0,
        "lag": 1,
        "topic": "test-topic"
    }, {
        "partition": 1,
        "lag": 2,
        "topic": "test-topic"
    }]
}
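
To make the proposed shape concrete, here is a small Python sketch that assembles such a response from per-partition lag values, including the optional topic filter. The function build_lag_response and its input layout (a dict keyed by topic and partition) are hypothetical and only illustrate the proposal; they do not describe an existing endpoint.

```python
import json


def build_lag_response(partition_lags, topic=None):
    """Build the proposed response body from {(topic, partition): lag}."""
    rows = [
        {"partition": partition, "lag": lag, "topic": name}
        for (name, partition), lag in sorted(partition_lags.items())
        if topic is None or name == topic  # optional topic filter
    ]
    return {
        "total_lag": sum(row["lag"] for row in rows),
        "lag_by_partition": rows,
    }


# Values taken from the example response above
lags = {("test-topic", 0): 1, ("test-topic", 1): 2}
print(json.dumps(build_lag_response(lags), indent=4))
```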

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Location: renovate.json
Error type: The renovate configuration file contains some invalid settings
Message: Invalid configuration option: extends[1].0, extends: preset value is not a string
