
The Prometheus NATS Exporter

The Prometheus NATS Exporter consists of both a package and an application that exports NATS server metrics to Prometheus for monitoring. It aggregates the server monitoring endpoints you choose (varz, connz, subz, routez, healthz) on a NATS server into a single Prometheus exporter endpoint.

Build

make build

To run the tests and linter:

make test
make lint

# If you want to see the coverage locally, then run this.
# make test-cover

Run

Start the prometheus-nats-exporter executable and poll the varz metrics endpoint of a NATS server running on localhost with a monitoring port of 5555:

prometheus-nats-exporter -varz "http://localhost:5555"
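
A single exporter instance can poll several monitoring endpoints at once by passing one flag per endpoint (all flags are documented under Usage below). For example:

prometheus-nats-exporter -varz -connz -routez "http://localhost:8222"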

To run with Docker, use the official image:

docker run natsio/prometheus-nats-exporter:latest
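
Arguments after the image name are passed through to the exporter binary (assuming the image's default entrypoint is the exporter itself); mynats below is a placeholder for a NATS monitoring endpoint reachable from inside the container:

docker run -p 7777:7777 natsio/prometheus-nats-exporter:latest -varz "http://mynats:8222"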

Usage

prometheus-nats-exporter <flags> url
  -D	Enable debug log level.
  -DV
    	Enable debug and trace log levels.
  -V	Enable trace log level.
  -a string
    	Network host to listen on. (default "0.0.0.0")
  -addr string
    	Network host to listen on. (default "0.0.0.0")
  -channelz
    	Get streaming channel metrics.
  -connz
    	Get connection metrics.
  -connz_detailed
    	Get detailed connection metrics for each client. Enables flag "-connz" implicitly.
  -healthz
    	Get health metrics.
  -gatewayz
    	Get gateway metrics.
  -accstatz
    	Get accstatz metrics.
  -leafz
    	Get leaf metrics.
  -http_pass string
    	Set the password for HTTP scrapes. NATS bcrypt supported.
  -http_user string
    	Enable basic auth and set user name for HTTP scrapes.
  -jsz string
    	Select JetStream metrics to filter (e.g. streams, accounts, consumers, all).
  -l string
    	Log file name.
  -log string
    	Log file name.
  -p int
    	Port to listen on. (default 7777)
  -path string
    	URL path from which to serve scrapes. (default "/metrics")
  -port int
    	Port to listen on. (default 7777)
  -prefix string
    	Replace the default prefix for all the metrics.
  -r string
    	Remote syslog address to write log statements.
  -remote_syslog string
    	Remote syslog address to write log statements.
  -replicatorVarz
    	Get replicator general metrics.
  -ri int
    	Interval in seconds to retry NATS Server monitor URL. (default 30)
  -routez
    	Get route metrics.
  -s	Write log statements to the syslog.
  -serverz
    	Get streaming server metrics.
  -subz
    	Get subscription metrics.
  -syslog
    	Write log statements to the syslog.
  -tlscacert string
    	Client certificate CA for verification (used with HTTPS).
  -tlscert string
    	Server certificate file (Enables HTTPS).
  -tlskey string
    	Private key for server certificate (used with HTTPS).
  -use_internal_server_id
    	Enables using ServerID from /varz.
  -use_internal_server_name
    	Enables using ServerName from /varz.
  -varz
    	Get general metrics.
  -version
    	Show exporter version and exit.

The URL parameter

The url parameter is a standard URL. Both http and https (when TLS is configured) are supported.

e.g. http://denver1.foobar.com:8222

Monitoring

The NATS Prometheus exporter exposes metrics through an HTTP interface, defaulting to http://0.0.0.0:7777/metrics.

When --http_user and --http_pass are used, you will need to set the username and password in Prometheus; see basic_auth in the Prometheus configuration documentation. If using a bcrypted password, use a very low cost, as scrapes occur frequently.
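
A minimal Prometheus scrape configuration with basic auth might look like the following sketch (the job name, credentials, and target are placeholders):

scrape_configs:
  - job_name: nats-exporter
    basic_auth:
      username: myuser
      password: mypassword
    static_configs:
      - targets: ['localhost:7777']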

The exporter returns output that is readable by Prometheus, e.g.:

# HELP gnatsd_varz_in_bytes in_bytes
# TYPE gnatsd_varz_in_bytes gauge
gnatsd_varz_in_bytes{server_id="http://localhost:8222"} 0
# HELP gnatsd_varz_in_msgs in_msgs
# TYPE gnatsd_varz_in_msgs gauge
gnatsd_varz_in_msgs{server_id="http://localhost:8222"} 0
# HELP gnatsd_varz_max_connections max_connections
# TYPE gnatsd_varz_max_connections gauge
gnatsd_varz_max_connections{server_id="http://localhost:8222"} 65536

The NATS Prometheus Exporter API

The NATS Prometheus exporter also provides a simple, easy-to-use API that allows it to run embedded in your code.

Import the exporter package

    // import the API like this
    import (
      "github.com/nats-io/prometheus-nats-exporter/exporter"
    )

API Usage

In just a few lines of code, configure and launch an instance of the exporter.

	// Get the default options, and set what you need to. The listen address
	// and port are how Prometheus polls for collected data.
	opts := exporter.GetDefaultExporterOptions()
	opts.ListenAddress = "localhost"
	opts.ListenPort = 8888
	opts.GetVarz = true
	opts.NATSServerURL = "http://localhost:8222"

	// create an exporter instance, ready to be launched.
	exp := exporter.NewExporter(opts)

	// start collecting data
	exp.Start()

	// when done, simply call Stop()
	exp.Stop()

	// For convenience, you can block until the exporter is stopped
	exp.WaitUntilDone()
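
Putting the pieces together, a minimal standalone program might look like this sketch (production code would add error handling and wire Stop() to a signal handler):

package main

import (
	"github.com/nats-io/prometheus-nats-exporter/exporter"
)

func main() {
	// Configure the exporter to serve varz metrics collected from a
	// local NATS server on the default monitoring port.
	opts := exporter.GetDefaultExporterOptions()
	opts.ListenAddress = "localhost"
	opts.ListenPort = 8888
	opts.GetVarz = true
	opts.NATSServerURL = "http://localhost:8222"

	exp := exporter.NewExporter(opts)
	exp.Start()

	// Block until something else (e.g. a signal handler) calls exp.Stop().
	exp.WaitUntilDone()
}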

Monitoring Walkthrough

For additional information, refer to the walkthrough of monitoring NATS with Prometheus and Grafana. The NATS Prometheus Exporter can be used to monitor NATS Streaming as well. Refer to the walkthrough/streaming documentation.


prometheus-nats-exporter's Issues

Unable to disable SSL validation

EDITED: I'm running the exporter in an isolated environment, running version 0.6.0. I connect to my NATS server through a reverse proxy that only exposes an https endpoint, but the NATS server has a cert signed by a private CA. I don't see an option to disable cert validation, and looking through the code, I see the following comment at collector/collector.go line 272, which I take to refer to the feature I'm looking for:

// TODO: Potentially add TLS config in the transport.

It would be nice to have this, but even the ability to disable cert validation would be a help. I'm not a golang dev, but with a little googling I came up with this change. It works for me, but I don't know if this is the proper way to do it.

This is a diff on the master branch

diff --git a/collector/collector.go b/collector/collector.go
index fb00ad2..9c2e7f9 100644
--- a/collector/collector.go
+++ b/collector/collector.go
@@ -18,6 +18,7 @@ import (
        "encoding/json"
        "io/ioutil"
        "net/http"
+       "crypto/tls"
        "strings"
        "sync"
        "time"
@@ -270,6 +271,8 @@ func (nc *NATSCollector) initMetricsFromServers(namespace string) {
 func newNatsCollector(system, endpoint string, servers []*CollectedServer) prometheus.Collector {
        // TODO:  Potentially add TLS config in the transport.
        tr := &http.Transport{}
+       //Obtained from https://stackoverflow.com/a/12122718/2036650
+       tr.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
        hc := &http.Client{Transport: tr}
        nc := &NATSCollector{
                httpClient: hc,

Example Usage With Grafana

Provide instructions to start a Grafana + Prometheus + NATS exporter stack, and load a NATS dashboard that demonstrates this exporter. Include steps to install and run, plus screenshots.

Streamline walkthrough

  • Update the prometheus command line to ensure an error occurs if it is not executed from the walkthrough directory.
  • Add notes about the default Grafana admin credentials for convenience

Missing decimals on metrics

I'm actually testing nats-streaming, so I've added the Prometheus exporter to monitor the bus.

Everything was fine for chan_msgs_total until reaching 1 million messages sent to the bus.
The exporter does not export metrics with decimals when using exponent form:

nss_chan_msgs_total{channel="helloworld",server_id="http://localhost:8222"} 1e+06

Expecting a value like 1.2295494e+06.

I couldn't find where in the code to fix this.

Use gometalinter

Ensure code passes gometalinter (with potential filters, etc), and add it to Travis CI.

Proposal: Add server info metric (nss_server_info)

We found that it would be extremely useful to expose more of the information provided by the serverz monitoring endpoint via the metrics mechanism. Namely, information such as "cluster_id", "server_id", "version", "go_version", "state", "role", and "start_time".

exporter should stop responding if nats is down

We have had outages on some staging servers because NATS was down but the exporter was still successfully scraped, so Prometheus considered it up. Most (if not all) exporters I use simply stop responding so that scraping fails and alerts are triggered.

NATS metrics are missing (docker compose)

Hi,

I have both NATS server 2.0.0 and exporter 0.4.0 running on the same server. However, the metrics reported by the exporter don't seem to include variables with the prefix 'gnatsd_varz', and thus the dashboard shows nothing.

The following is my Docker Compose configuration. Is there anything I missed? Any suggestions are appreciated.
varz.json.txt
metrics.txt

services:
  nats-server:
    command:
      - "-p"
      - "4222"
      - "-m"
      - "8222"
      - "-cluster"
      - "nats://0.0.0.0:6222"
    ports:
      - 4222:4222
      - 6222:6222
      - 8222:8222
    image: nats:latest
    container_name: nats-server
  prometheus-nats-exporter:
    image: natsio/prometheus-nats-exporter
    hostname: prometheus-nats-exporter
    command: "-varz http://0.0.0.0:8222"
    ports:
      - "7777:7777"
  prometheus:
    image: prom/prometheus:latest
    hostname: prometheus
    volumes:
      - "./prometheus.yml:/etc/prometheus/prometheus.yml"
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    hostname: grafana
    ports:
      - "3000:3000"

QUESTION: Is it possible to add a custom metrics prefix or custom label?

My DevOps team manages the central Prometheus and uses a configuration that requires us to provide a custom match prefix for our components' metrics. My dev team needs to set a custom prefix for all of our products' components. Is it possible to do this with the exporter, or to set a custom label?
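
For the prefix half of the question, the -prefix flag documented under Usage above covers it; a sketch (the prefix value is a placeholder):

prometheus-nats-exporter -prefix "myteam" -varz "http://localhost:8222"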

Leaked credentials at the `server_id` label

When the NATS URL contains a user and password (because NATS server authentication is enabled), the credentials are reported as part of the metric's server_id label. It's true that we can specify a tag to override the NATS URL in the server_id label, but in my particular use case I want to report the NATS URL, not a tag, and I don't want to expose my credentials.
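
One possible mitigation, sketched with the Go standard library rather than the exporter's actual code: parse the URL and drop the userinfo component before using it as a label value.

import "net/url"

// redactURL strips any user:password section from a monitoring URL so
// it can be used safely as a metric label value.
func redactURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return raw // fall back to the original string if unparseable
	}
	u.User = nil // remove the userinfo component entirely
	return u.String()
}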

General Availability

Open this repository up to the public.

This will require:

  • Example with Grafana, include dashboard, installation steps
  • Ensure acceptable code coverage (85-90%)
  • Review of code, tests (#5)
  • Thorough review of the README
  • Identify and resolve any critical issues or missing functionality (TLS).
  • Cross compile, package generation for release.

Nats Streaming server server_id is empty in Metrics data

Nats Endpoint output
curl http://localhost:8222/streaming/serverz
Response:
{
  "cluster_id": "test-cluster",
  "server_id": "iZd5VFnDFjeY49HNP8kAiD",
  "version": "0.17.0",
  "go": "go1.13.7",
  "state": "STANDALONE",
  "now": "2020-05-07T14:26:46.098524789Z",
  "start_time": "2020-05-06T08:18:33.001460644Z",
  "uptime": "13h24m6s",
  "clients": 1,
  "subscriptions": 0,
  "channels": 1,
  "total_msgs": 4277,
  "total_bytes": 11579858,
  "in_msgs": 4277,
  "in_bytes": 11892206,
  "out_msgs": 0,
  "out_bytes": 0,
  "open_fds": 20,
  "max_fds": 1048576
}

Metrics endpoint output

# HELP gnatsd_varz_auth_timeout auth_timeout
# TYPE gnatsd_varz_auth_timeout gauge
gnatsd_varz_auth_timeout{server_id=""} 1
# HELP gnatsd_varz_connections connections
# TYPE gnatsd_varz_connections gauge
gnatsd_varz_connections{server_id=""} 4

Support for environment variables and a config file (Docker)

Hello,

It would be very nice if the Docker image could support environment variables and also a configuration file.

It is much easier to configure the container with environment variables or a configuration file than with a long list of command-line arguments.

Thanks

use of closed network connection when stopping the server

Currently this only shows up when running in debug mode:

^C[24345] 2019/07/11 14:04:04.173692 [DBG] Stopping.
[24345] 2019/07/11 14:04:04.173852 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.173888 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.173900 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.173911 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.173940 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.173963 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.173975 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.174161 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.174172 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection
[24345] 2019/07/11 14:04:04.174182 [DBG] Unable to start HTTP server (may already be running): accept tcp [::]:7777: use of closed network connection

Support boolean types

Currently, boolean types are ignored when setting up the Prometheus metrics. Add support for these.
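
A minimal sketch of one possible mapping, following the common convention of exposing booleans as 0/1 gauge values (this is an illustration, not the exporter's actual code):

// boolToFloat maps a boolean JSON field onto the 0/1 convention used
// for Prometheus gauge values.
func boolToFloat(b bool) float64 {
	if b {
		return 1
	}
	return 0
}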

Export connection info

What: Export information about individual connections

Why: To allow prometheus alerting on slow clients and track bandwidth usage by client.

How: Something like this:

gnatsd_connz_out_bytes{name="xxx", ip="host", port="port", cid="yyy"}
gnatsd_connz_in_bytes{name="xxx", ip="host", port="port", cid="yyy"}
... etc

The SNMP exporter uses a similar pattern:

ifHCOutOctets{ifAlias="",ifDescr="Unit: 1 Slot: 0 Port: 27 10G - Level",ifIndex="27",ifName="Te1/0/27"} 48839
ifInErrors{ifAlias="",ifDescr="Unit: 1 Slot: 0 Port: 23 10G - Level",ifIndex="23",ifName="Te1/0/23"} 0

Prometheus recommended changes

  • Prometheus does not endorse the use of labels; remove label usage from documentation.
  • Prometheus requires a 1:1 pairing of an exporter/server.
  • Use ConstMetrics rather than GaugeVec and CounterVec. This can be deferred until later.
  • Remove PrometheusMetricConfig

See prometheus/docs#748.

proposal: refactor to use less/no reflection

Hi everyone, thanks for the great work on NATS and on this exporter!

So, I was looking at issues like #37 and #39 and was looking forward to implementing them, but I found some issues that, while not blockers, would make the code smell bad.

The exporter currently uses reflection to determine the metrics: we have a generic exporter to which we pass an endpoint, and it exposes all metrics as gauges.

Some problems in there:

  • we may have metrics better exposed as other types (eg: counters)
  • the metrics don't have the HELP statement

This is kind of OK for the current scope, but if we are going to expose streaming metrics as well, this will become a bigger issue.

On the reflection side, for example, if we want channel subscription metrics (streaming/channelsz?subs=1), the actual metrics are under subscriptions in the JSON, not at the root, so we would need to keep "going down" until we reach the actual metrics.

My proposal is:

  • get rid of the reflection in favor of a simpler solution, parsing the JSON output into structs, etc.
  • add the missing HELPs
  • finally, start adding the streaming metrics

We can still have plenty of "common code" (HTTP requests, etc.); we will just have more declarative prometheus.Desc declarations, and the parsing will be done into the appropriate structs instead of map[string]interface{}.

What do you think?
If you agree, I can work on a prototype...
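
To make the idea concrete, a rough sketch of the struct-plus-Desc approach for one endpoint (the types, names, and fields here are illustrative, not a finished design):

import (
	"encoding/json"

	"github.com/prometheus/client_golang/prometheus"
)

// channelsz mirrors only the fields we need from /streaming/channelsz.
type channelsz struct {
	Channels []struct {
		Name string `json:"name"`
		Msgs int64  `json:"msgs"`
	} `json:"channels"`
}

// chanMsgsDesc is declared once, with a proper HELP string.
var chanMsgsDesc = prometheus.NewDesc(
	"nss_chan_msgs_total", "Total messages received on the channel.",
	[]string{"channel"}, nil,
)

// collectChannels decodes an endpoint response body and emits one
// metric per channel, with no reflection involved.
func collectChannels(body []byte, ch chan<- prometheus.Metric) error {
	var cz channelsz
	if err := json.Unmarshal(body, &cz); err != nil {
		return err
	}
	for _, c := range cz.Channels {
		ch <- prometheus.MustNewConstMetric(
			chanMsgsDesc, prometheus.CounterValue, float64(c.Msgs), c.Name)
	}
	return nil
}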

Support RTT metrics

Is it possible to get RTT?
Or could you tell me another way to get data like it?

[Request] Add a health check endpoint

We're deploying an instance of the metrics exporter to K8s, and it would be useful to have a healthz endpoint that just returns 200.

We're hitting the metrics endpoint at the moment, but I suspect that causes a lot of unnecessary work whose output is then ignored by K8s.

NATS Streaming Server has more than 1024 channels yet only 1024 channels worth of metrics are exported

URL: s.URL + "/streaming/channelsz?subs=1",

When I hit the streaming/channelsz monitoring endpoint myself, I need to add a query parameter such as limit=10000 to see all of my channels. Yet it seems that the exporter code doesn't specify a limit and therefore only gets the first 1024.

Please advise. Would it make sense to add an argument to configure this limit when starting the exporter?
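
For reference, the workaround described above, run directly against the monitoring endpoint:

curl "http://localhost:8222/streaming/channelsz?subs=1&limit=10000"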

An error has occurred during metrics gathering

After updating our StatefulSet in Kubernetes, the exporter stopped working with the following error:

An error has occurred during metrics gathering:

3 error(s) occurred:
* collected metric nss_chan_subs_last_sent label:<name:"channel" value:"shop.feature.disabled" > label:<name:"client_id" value:"" > label:<name:"server" value:"http://localhost:8222" > gauge:<value:353 >  was collected before with the same name and label values
* collected metric nss_chan_subs_pending_count label:<name:"channel" value:"shop.feature.disabled" > label:<name:"client_id" value:"" > label:<name:"server" value:"http://localhost:8222" > gauge:<value:0 >  was collected before with the same name and label values
* collected metric nss_chan_subs_max_inflight label:<name:"channel" value:"shop.feature.disabled" > label:<name:"client_id" value:"" > label:<name:"server" value:"http://localhost:8222" > gauge:<value:1024 >  was collected before with the same name and label values

We are running nats-streaming v0.14.0 with the exporter as a sidecar.

Incorrect Nats Streaming Metrics Types

Going to localhost:7777/metrics, nss_server_bytes_total proclaims itself to be a counter:

# HELP nss_server_bytes_total Total of bytes
# TYPE nss_server_bytes_total counter

And yet it can actually go down as messages are deleted from the servers due to old age (which is expected); see the attached screenshot (Screen Shot 2019-04-19 at 12 40 05 PM).
Am I doing something silly, or are the metric types weird (and if they are, there might be others)?

Missing maintainer in Debian package

The Debian package is missing a maintainer, which causes warnings during dpkg processing.

Warnings like this appear on every dpkg/apt run:

dpkg-query: warning: parsing file '/var/lib/dpkg/status' near line 76605 package 'prometheus-nats-exporter':
 missing maintainer

As seen in the Debian package for v0.1.0, the maintainer is empty:

sh# dpkg -I /tmp/prometheus-nats-exporter-v0.1.0-linux-amd64.deb 
 new Debian package, version 2.0.
 size 2097286 bytes: control archive=386 bytes.
       1 bytes,     0 lines      conffiles            
     278 bytes,    16 lines      control              
      72 bytes,     1 lines      md5sums              
 Package: prometheus-nats-exporter
 Version: 0.1.0
 Section: 
 Priority: 
 Architecture: amd64
 Maintainer: 
 Vendor: nats.io
 Installed-Size: 5375
 Replaces: 
 Provides: 
 Depends: 
 Recommends: 
 Suggests: 
 Conflicts: 
 Homepage: https://nats.io
 Description: A Prometheus exporter for NATS

Please add a maintainer string to the .goreleaser.yml file in the nfpm section.
For example, use the first maintainer from MAINTAINER.md:

nfpm:
  vendor: nats.io
  maintainer: Colin Sullivan <[email protected]>
  ...

(No pull request provided, as there is a single maintainer listed in MAINTAINER.md.)

support custom labels to be set through a command-line flag

It would be really convenient for DevOps teams to monitor and manage different NATS clusters in the cloud if custom labels were supported.

There is a related issue (#86) about this concern, but only a prefix is supported.

My suggestion is to add another flag that lets users set labels to be added to all metrics, where these labels are all constant labels.
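
In client_golang terms, this maps onto ConstLabels, which attach a fixed label set to every sample of a metric; a sketch (the label names and values stand in for whatever a hypothetical command-line flag would collect):

import "github.com/prometheus/client_golang/prometheus"

// connGauge carries operator-supplied constant labels on every sample.
var connGauge = prometheus.NewGauge(prometheus.GaugeOpts{
	Name:        "gnatsd_varz_connections",
	Help:        "connections",
	ConstLabels: prometheus.Labels{"cluster": "prod-east", "team": "platform"},
})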

An error has occurred during metrics gathering: ... was collected before with the same name and label values

We get an error when prometheus-nats-exporter gathers duplicate metrics:

An error has occurred during metrics gathering:

33 error(s) occurred:
* collected metric nss_chan_subs_last_sent label:<name:"channel" value:"AAAAA" > label:<name:"client_id" value:"XYZ" > label:<name:"durable_name" value:"TBHT" > label:<name:"inbox" value:"_INBOX.y8DGMCCCCCrZ" > label:<name:"is_durable" value:"true" > label:<name:"is_offline" value:"false" > label:<name:"queue_name" value:"TBHT" > label:<name:"server_id" value:"http://localhost:8223" > gauge:<value:0 >  was collected before with the same name and label values

As I read it, this issue should have been fixed since version 0.2.2 (see issue #67).

We are using prometheus-nats-exporter version 0.6.0 with the official Docker image and the following options in a Kubernetes pod on Azure AKS:

"-varz", "-connz", "-routez", "-subz", "-channelz", "-serverz", "http://localhost:8222", "http://localhost:8223"

The exporter is providing the metrics for the nats and nats-streaming containers running in the same pod.

Support Array/Vector type of metrics

Support arrays/vectors, such as the list of individual connections and subscriptions. One approach may be to create a deterministic namespace of metrics to represent individual elements, but it is worthwhile to first look into what other Prometheus exporters do with these types of data.

Zipkin-like tracing

Is it possible to get Zipkin-like tracing into Prometheus from the exporter?

Please suggest.

Please provide a gnatsd_varz_server_info metric

To get full information about our server infrastructure, we would like to have the version, server_name, and server_id values in a dedicated metric. Currently we get that information from nats-surveyor metrics only. It would be useful for detecting version inconsistencies.
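
For reference, the conventional Prometheus pattern for this is an "info" metric: a gauge fixed at 1 whose labels carry the static values. A sketch using client_golang (the metric and label names are illustrative, not part of the exporter):

import "github.com/prometheus/client_golang/prometheus"

// varzInfoDesc describes a hypothetical gnatsd_varz_server_info metric.
var varzInfoDesc = prometheus.NewDesc(
	"gnatsd_varz_server_info",
	"General server information.",
	[]string{"server_id", "server_name", "version"},
	nil,
)

// newServerInfoMetric builds a constant gauge of 1 whose labels carry
// values read from the /varz endpoint.
func newServerInfoMetric(serverID, serverName, version string) prometheus.Metric {
	return prometheus.MustNewConstMetric(
		varzInfoDesc, prometheus.GaugeValue, 1,
		serverID, serverName, version,
	)
}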

Can't resolve the DNS name

Hi guys!
I'm deploying the NATS exporter in an OpenShift environment and I'm getting an error message:
Error loading metric config from response: Get "http://nats-cluster.dummy1-prd01-tontin.svc.cluster.local:8222/varz": dial tcp: lookup nats-cluster.dummy1-prd01-tontin.svc.cluster.local on 10.64.61.101:53: no such host

I ran a curl from another pod in the same namespace where the NATS exporter is running and it works perfectly, so NATS is exposing the metrics.

Any ideas how I can fix this issue?

thanks in advance!

PS: I have the NATS exporter running in other environments (not k8s) and it's working fine so far.
