cloudflare / unsee

Alert dashboard for Prometheus Alertmanager

License: Apache License 2.0

Makefile 1.32% Go 54.14% JavaScript 27.25% CSS 2.50% HTML 13.17% Shell 0.19% Python 1.43%

monitoring alerting dashboard prometheus alertmanager

unsee's Issues

Support alertmanager 0.15.x

I'm using unsee against AM 0.15.1, and a few things look weird.
Can I verify myself if the new alertmanager is the culprit, or assist in debugging it?

The exact problem I'm experiencing: the alertname doesn't show in the header of an "incident block", it's promoted (or demoted) to the listing inside the incident block.

Upgrade moment.js

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-18214

How to add Alert manager name for each instance

We have 3 Alertmanagers running in Prod/Preprod/QA and would like to give each one a specific name. Please let us know where to set the Alertmanager names. Right now, when we use @alertmanager, the name shows as "default".

We use Unsee to see all alerts from the 3 Alertmanagers, but we cannot tell them apart because everything is displayed as default. Please advise. Thanks

Provide link to alertmanager silence

When looking at a silenced alert it would be nice to be able to get a link to the silence in alertmanager.

If nothing else it would be acceptable to be able to get the silence id to search in amtool

[::1]:53: read: connection refused

Hi,

Sometimes I get this error in the unsee UI

Get http://prometheus-alertmanager-xxxxx/api/v1/silences: dial tcp: lookup prometheus-alertmanager-xxxxx on [::1]:53: read udp [::1]:55668->[::1]:53: read: connection refused

I have to delete the UNSEE pods to make it disappear.

using nginx

Is there a specific flag I'm supposed to use to get it working with nginx?

I'm trying to set up a container with Alertmanager and unsee.
When I hit /unsee I get a 404.

I run unsee with the following ...

/usr/bin/unsee --listen.port 9095 --alertmanager.uri http://127.0.0.1:9094 --listen.prefix /unsee

This is my nginx file ...

server {
    listen 9093;

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;

        proxy_pass http://127.0.0.1:9094;
        proxy_redirect off;

        # Socket.IO Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

server {
    listen 9093;

    location /unsee {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;

        proxy_pass http://127.0.0.1:9095;
        proxy_redirect off;

        # Socket.IO Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

Panic in semantic version parsing

Debian sometimes likes to append git commit IDs to package versions, and this results in a panic when Unsee tries to parse the version:

INFO[0000] GET http://fra-alertmgr:9093/api/v1/status timeout=40s 
INFO[0000] [default] Remote Alertmanager version: 0.15.0~rc.1~git20180507.28967e3+ds 
panic: semver: Parse(0.15.0~rc.1~git20180507.28967e3+ds): Invalid character(s) found in patch number "0~rc.1~git20180507.28967e3"

goroutine 11 [running]:
github.com/cloudflare/unsee/vendor/github.com/blang/semver.MustParse(0xc420025440, 0x22, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/travis/gopath/src/github.com/cloudflare/unsee/vendor/github.com/blang/semver/semver.go:319 +0x20c
github.com/cloudflare/unsee/internal/mapper/v04.SilenceMapper.IsSupported(0x0, 0x0, 0xc420025440, 0x22, 0xc4201066a0)
        /home/travis/gopath/src/github.com/cloudflare/unsee/internal/mapper/v04/silences.go:67 +0x70
github.com/cloudflare/unsee/internal/mapper.GetSilenceMapper(0xc420025440, 0x22, 0xc42002e500, 0xc420065418, 0xc420065420, 0xc4200907d0)
        /home/travis/gopath/src/github.com/cloudflare/unsee/internal/mapper/mapper.go:59 +0xa1
github.com/cloudflare/unsee/internal/alertmanager.(*Alertmanager).pullSilences(0xc420161110, 0xc420025440, 0x22, 0x0, 0x0)
        /home/travis/gopath/src/github.com/cloudflare/unsee/internal/alertmanager/models.go:107 +0x79
github.com/cloudflare/unsee/internal/alertmanager.(*Alertmanager).Pull(0xc420161110, 0x23, 0xc420040f90)
        /home/travis/gopath/src/github.com/cloudflare/unsee/internal/alertmanager/models.go:300 +0x6d
main.pullFromAlertmanager.func1(0xc420026f20, 0xc420161110)
        /home/travis/gopath/src/github.com/cloudflare/unsee/timer.go:25 +0xab
created by main.pullFromAlertmanager
        /home/travis/gopath/src/github.com/cloudflare/unsee/timer.go:23 +0x10e

I realize that this is technically not a bug in Unsee, rather in blang/semver, but perhaps Unsee could offer a workaround?
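One possible workaround, sketched here with the standard library only, is to strip the distro suffix before the version string reaches the semver parser. The function name coreVersion is hypothetical and not part of Unsee or blang/semver. Note that the sketch also discards genuine pre-release information such as "rc.1", which may or may not be acceptable.

```go
package main

import (
	"fmt"
	"regexp"
)

// coreVersionRe matches the leading MAJOR.MINOR.PATCH digits of a version
// string, with an optional "v" prefix.
var coreVersionRe = regexp.MustCompile(`^v?(\d+\.\d+\.\d+)`)

// coreVersion (hypothetical helper) returns the plain MAJOR.MINOR.PATCH
// prefix of raw, dropping suffixes such as "~rc.1~git20180507.28967e3+ds".
// The boolean is false when raw does not start with three dot-separated
// numbers.
func coreVersion(raw string) (string, bool) {
	m := coreVersionRe.FindStringSubmatch(raw)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	v, ok := coreVersion("0.15.0~rc.1~git20180507.28967e3+ds")
	fmt.Println(v, ok) // prints "0.15.0 true"
}
```

The cleaned string could then be handed to a non-panicking parse call, with the unsupported case handled explicitly instead of crashing.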

Document alerts showing briefly until silence cycle applied

We should document prometheus/alertmanager#609.

Option to remove summary or description

Hey there - thanks for unsee, just started using and it's great.

One question: is it possible to remove either alert description or summary from the dashboard? Would clear up a good amount of dashboard space. For our purposes we don't need both.

Thanks!

Can't use comma in filters

Because , is used to separate filters it's impossible to use it as a filter value.
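One way this could be addressed is an escape sequence for literal commas. The sketch below assumes a backslash escape ("\,"), which is a hypothetical syntax and not something unsee is known to support; the function name splitFilters is also made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFilters splits expr on commas, treating "\," as a literal comma
// inside a filter value. The backslash escape is an assumed syntax.
func splitFilters(expr string) []string {
	var (
		filters []string
		current strings.Builder
	)
	runes := []rune(expr)
	for i := 0; i < len(runes); i++ {
		switch {
		case runes[i] == '\\' && i+1 < len(runes) && runes[i+1] == ',':
			current.WriteRune(',')
			i++ // skip the escaped comma
		case runes[i] == ',':
			filters = append(filters, current.String())
			current.Reset()
		default:
			current.WriteRune(runes[i])
		}
	}
	return append(filters, current.String())
}

func main() {
	for _, f := range splitFilters(`alertname=Foo\,Bar,env=prod`) {
		fmt.Println(f)
	}
}
```

With the input above, the first filter keeps its comma ("alertname=Foo,Bar") and the second is "env=prod".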

Feature: Take username from http header

Would be great to enable taking the username from a HTTP header.

Support Alertmanager with TLS Auth?

Dear all,

I have many Alertmanagers, some of them with TLS auth.
Is it possible to use a client cert in Unsee to scrape alerts?

Regards
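For reference, this is roughly what client-certificate support looks like with Go's standard library. It is a minimal sketch, not Unsee's code: newTLSClient is a hypothetical helper and the file paths are placeholders.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// newTLSClient (hypothetical helper) builds an HTTP client that presents
// the given client certificate during the TLS handshake.
func newTLSClient(certFile, keyFile string, timeout time.Duration) (*http.Client, error) {
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, fmt.Errorf("loading client certificate: %w", err)
	}
	transport := &http.Transport{
		TLSClientConfig: &tls.Config{Certificates: []tls.Certificate{cert}},
	}
	return &http.Client{Transport: transport, Timeout: timeout}, nil
}

func main() {
	// Placeholder paths; replace with the real certificate and key.
	client, err := newTLSClient("/var/certs/cert.pem", "/var/certs/pkey.pem", 30*time.Second)
	if err != nil {
		fmt.Println("cannot build client:", err)
		return
	}
	_ = client // the client would be used for the Alertmanager API requests
}
```

A per-Alertmanager setting for the certificate and key would be needed so that instances with and without TLS auth can be mixed.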

Source/generator URL has misleading label

The 'generator URL' which links to a graph in Prometheus UI is currently labelled according to the Alertmanager instance.

This is misleading, since the Prometheus instance that triggered the alert does not necessarily map to a given Alertmanager.

Moreover, it makes it hard to find the link. Anecdotally, most people using Unsee don't seem to be aware that you can click on the name of the Alertmanager instance and see the data that triggered an alert.

Silences POST fails with 307 Temporary Redirect due to double-slash in URL

When trying to create a silence in Unsee, the POST fails because it has a double-slash in the URL, causing Alertmanager to respond with a 307 Temporary Redirect.

For example, the initial POST goes to http://ewr-alertmgr:9093//api/v1/silences and fails with the 307. If I edit and resend the POST using Firefox developer console, with the extra slash removed, it succeeds.

Tested with Unsee 0.9.2 and Alertmanager 0.15.0-rc1.
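The double slash appears when a base URI ending in "/" is concatenated with a path starting with "/". A small sketch of the normalisation, written in Go for illustration (joinURL is a hypothetical helper, and the same idea applies wherever the URL is assembled):

```go
package main

import (
	"fmt"
	"strings"
)

// joinURL joins a base URI and an API path with exactly one slash between
// them, regardless of trailing or leading slashes on either side.
func joinURL(base, path string) string {
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(path, "/")
}

func main() {
	fmt.Println(joinURL("http://ewr-alertmgr:9093/", "/api/v1/silences"))
	// prints "http://ewr-alertmgr:9093/api/v1/silences"
}
```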

Credentials in URLs are not masked

Hi,

When an Alertmanager is unavailable for some reason, the unsee UI starts showing a warning on the screen. The warning contains the full URL, including the unmasked credentials (username/password) used for Basic Auth.

It would be nice if we can get the credentials part masked to not leak credentials that easily (e.g. the unsee UI is shown on TVs in our offices). Same should be the case for any outputs (logs etc.)

-- Sven
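A minimal sketch of the masking, assuming the URI is handled as a plain string before it is logged or shown. The helper name maskCredentials and the "xxx" placeholder are illustrative choices, not Unsee's implementation.

```go
package main

import (
	"fmt"
	"net/url"
)

// maskCredentials replaces any username and password embedded in raw with
// placeholders. If raw cannot be parsed, a fixed marker is returned so the
// original string is never echoed back.
func maskCredentials(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "<unparseable URI>"
	}
	if u.User != nil {
		u.User = url.UserPassword("xxx", "xxx")
	}
	return u.String()
}

func main() {
	fmt.Println(maskCredentials("https://admin:s3cret@alertmanager.example.com/api/v1/status"))
	// prints "https://xxx:xxx@alertmanager.example.com/api/v1/status"
}
```

Applying the same function to log output and to error messages sent to the UI would cover both cases mentioned above.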

[0.9.1] Silence on Proxied Alertmanagers returns 404

Hi,

I configured unsee so that browser requests to our Alertmanagers are proxied via unsee. But when I try to create a new silence in the UI, the developer console shows that the POST request to the proxy endpoint returns a 404. We are using unsee 0.9.1.

request:

POST https://unsee.mydomain.com/proxy/alertmanager/test/api/v1/silences 404

snippet from the config:

...
servers:
  - name: test
    uri: https://username:[email protected]
    timeout: 30s
    proxy: true
  ...
  listen:
    address: "0.0.0.0"
    port: 8080 # port in Docker container (we are running unsee in k8s)
    prefix: /
  ...

When triggering the silencing requests, I can see the following debug log:

time="2018-04-12T15:50:02Z" level=debug msg="[test] Proxy request for /api/v1/silences"

Any idea what causes the issue or how I can debug this?

Thx a lot,
Sven

drill alerts from more than one alertmanager?

We have an Alertmanager instance inside each of our environments. It would be great if unsee could get data from all of them and display them together.

Support undo/redo actions for query editing

This is a fairly minor request, but when manually editing the query field- say to add alertname=My_Stupid_Alert, due to the JS manipulations that occur undo/redo logic isn't possible. Specifically since the text is converted to a different object, the browser cannot 'undo' that action.

When one screws up and removes long parts of the query, this is a bit annoying.

I only have a surface understanding of the DOM/events involved in this, but http://mattjmattj.github.io/simple-undo/ is an example that can likely be examined for which hooks we'd need to file. From there something like https://github.com/ArthurClemens/Javascript-Undo-Manager/blob/master/lib/undomanager.js gives an object approach for managing the queue of changes, and providing a 'history' that the undo/redo can act upon.

Add tls certs to the "curl" example command for silencing

This line...

unsee/assets/static/silence.js

Line 95 in 4c306c0

d.push("curl " + alertmanagerSilencesAPIUrl(uri));


Produces a curl command like this, which is super handy!

curl https://alertmanager1.internal.net:9093//api/v1/silences
  -X POST --data {
  "matchers": [
    {
      "name": "alertname",
      "value": "SshProbeFailing",
      "isRegex": false
    }
  ],
  "startsAt": "2018-08-23T21:34:50.845Z",
  "endsAt": "2018-08-23T22:34:50.989Z",
  "createdBy": "[email protected]",
  "comment": "I'm fixing this..."
}

But, it doesn't include the TLS settings (e.g. from the config file or commandline args). If TLS options are included, it should look more like this...

curl https://alertmanager1.internal.net:9093//api/v1/silences --cert /var/certs/cert.pem --key /var/certs/pkey.pem
  -X POST --data {
  "matchers": [
    {
      "name": "alertname",
      "value": "SshProbeFailing",
      "isRegex": false
    }
  ],
  "startsAt": "2018-08-23T21:34:50.845Z",
  "endsAt": "2018-08-23T22:34:50.989Z",
  "createdBy": "[email protected]",
  "comment": "I'm fixing this..."
}

Support for severity levels?

Some other monitoring systems have a concept of severity level for alerts, e.g. a high_memory_usage alert is much less important than a service_is_down alert. A way to differentiate severity levels in unsee (color, shape, size, etc.) would be very helpful when you have a lot of alert noise and want to be sure you won't miss an important alert somewhere. Ordering based on severity could also help if you want to prioritize your work on eliminating them.

A simple implementation would be to allow the user to configure one label to be used as the severity level and to add UI support for ordering based on that label (or even better, ordering based on any label).
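The ordering part of this proposal can be sketched as follows. Everything here is an assumption made for illustration: the alert type, the sortBySeverity helper, and the idea that the label name and the ordered list of values come from configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// alert is a simplified stand-in for an alert with its labels.
type alert struct {
	Name   string
	Labels map[string]string
}

// sortBySeverity orders alerts by the value of the given label. Values are
// ranked by their position in order; alerts with an unknown or missing
// value sort last. The sort is stable, so ties keep their original order.
func sortBySeverity(alerts []alert, label string, order []string) {
	rank := make(map[string]int, len(order))
	for i, v := range order {
		rank[v] = i
	}
	rankOf := func(a alert) int {
		if r, ok := rank[a.Labels[label]]; ok {
			return r
		}
		return len(order)
	}
	sort.SliceStable(alerts, func(i, j int) bool {
		return rankOf(alerts[i]) < rankOf(alerts[j])
	})
}

func main() {
	alerts := []alert{
		{Name: "high_memory_usage", Labels: map[string]string{"severity": "warning"}},
		{Name: "service_is_down", Labels: map[string]string{"severity": "critical"}},
	}
	sortBySeverity(alerts, "severity", []string{"critical", "warning", "info"})
	for _, a := range alerts {
		fmt.Println(a.Name)
	}
}
```

With this input, service_is_down is listed before high_memory_usage. Because the label name is a parameter, the same function covers the "order based on any label" variant.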

Allow for a web prefix

It would be nice to be able to serve unsee from a given web prefix - so that it's easy peasy to put behind a proxy-pass'ing web server.

Docker Image Unsupported URI scheme

Docker Images from 0.7.1 to 0.8.0

I get:

prometh_unsee.1.pow67kacmekn@dallpdsm93050u    | time="2018-01-22T23:03:34Z" level=error msg="[http] //alertmanager:9093/api/v1/status request failed: Unsupported URI scheme '' in '//alertmanager:9093/api/v1/status'"

In my compose file I've tried:

environment:
      ALERTMANAGER_URIS: "http://alertmanager:9093"

environment:
      ALERTMANAGER_URIS: 'http://alertmanager:9093'

environment:
      ALERTMANAGER_URIS: alertmanager:9093"

Silence comments only appear after a while

After silencing an alert via the unsee dashboard I don't see the comment immediately. As a matter of fact I only see the comment after silence expiration. What am I missing? I'm using v0.9-14.

WEB_PREFIX ENV not working

Using docker-compose with

image: cloudflare/unsee:latest
environment:
  WEB_PREFIX: /prefix/

doesn't work.
Looking at the logs with

docker logs unsee

I see that the prefix is

msg=" prefix: /"

How to configure for HA Alertmanagers?

We should determine and document how to configure Unsee when running Alertmanager in a highly-available (HA) setup.

Version 0.5.x of Alertmanager introduces HA capability, which roughly works as follows:

Prometheus must be configured to send alerts to all instances (Alertmanager instances do not share alert data between them)
Alertmanager shares silences and notification events between instances in the HA group using the Mesh gossip library
Alertmanager will avoid sending a notification if one of its peers has already sent a notification for the same alert

That model works well for sending notifications (since you should receive a notification if it reached at least one of the peers), but less well for API queries since each Alertmanager instance may have a differing view of what alerts are currently firing.

Seems to me that the options are:

1. Rely on one instance and accept that some alerts may be missing - if they're severe enough, we should be paged for them anyway.
2. Try to poll all Alertmanager instances and merge the results.

My intuition says that option 1 is by far the most preferable, for simplicity's sake. If we agree, then we should document that approach.
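For comparison, the second option (polling every instance and merging) would need some form of deduplication, since the same alert is sent to all instances. A rough sketch, assuming two alerts are the same when their label sets are identical; the alert type and both helpers are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// alert is a simplified stand-in for an alert returned by one instance.
type alert struct {
	Labels map[string]string
}

// fingerprint builds a stable key from the sorted label set, so that two
// alerts with identical labels produce the same key.
func fingerprint(labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, name := range names {
		b.WriteString(name)
		b.WriteByte('=')
		b.WriteString(labels[name])
		b.WriteByte(0) // separator that cannot clash with label text
	}
	return b.String()
}

// mergeAlerts combines the alert lists of several instances, keeping the
// first copy of each distinct label set.
func mergeAlerts(instances ...[]alert) []alert {
	seen := make(map[string]bool)
	var merged []alert
	for _, alerts := range instances {
		for _, a := range alerts {
			key := fingerprint(a.Labels)
			if seen[key] {
				continue
			}
			seen[key] = true
			merged = append(merged, a)
		}
	}
	return merged
}

func main() {
	am1 := []alert{{Labels: map[string]string{"alertname": "DiskFull", "instance": "a"}}}
	am2 := []alert{
		{Labels: map[string]string{"alertname": "DiskFull", "instance": "a"}},
		{Labels: map[string]string{"alertname": "HostDown", "instance": "b"}},
	}
	fmt.Println(len(mergeAlerts(am1, am2))) // prints 2
}
```

The extra polling, error handling per instance, and this merge step are the complexity that the first option avoids.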

Feature: HTML safe annotations

Would be great to mark some annotations HTML safe.

can't build from master due to bindata_assetfs problem

Cloned master. Trying "make run". Got:

go-bindata-assetfs -prefix assets -nometadata assets/templates/... assets/static/...
make: go-bindata-assetfs: Command not found
Makefile:41: recipe for target 'bindata_assetfs.go' failed
make: *** [bindata_assetfs.go] Error 127

Cannot unmarshal silencesData

INFO[0360] GET http://monitoring-1:9093/api/v1/silences?limit=4294967295
ERRO[0360] json: cannot unmarshal array into Go struct field SilenceAPIResponse.data of type alertmanager.silencesData

Feature: Export a text summary of an alert

I often see people taking screenshots of alerts in Unsee and adding them to JIRA tickets to show what alert fired when they were investigating an issue. It's difficult to copy information from a screenshot, so it seems it would be useful to have a way to copy the alert details as text.

One solution is to send alerts directly to JIRA (we do that already), but for alerts not sent to JIRA, I think this would be useful. The same information can be copied from amtool but I think people are unlikely to jump from their browser to their shell to do that.

Maybe we should just provide a link to the alert page in Alertmanager where the information can be copy-pasted.

Rewrite UI using react

It's officially 2018, time for a cleaner UI code.
It should also use SSE or websockets instead of AJAX polling, which will require backend code changes.

Feature: Link to silence page when adding a silence

As of #206, the silence ID is shown in the UI when a silence is added.

Suggest that the silence ID links to the Alertmanager silence page, e.g.:

https://alertmanager.example.com/#/silences/1234-1234-1234-1234

Rename @status to @state

It's time to release unsee 0.5, but one thing might need changing first. The @status filter was added to the master branch to support the new status key from Alertmanager >=0.6.1.
That status key ended up being nested in Alertmanager (it was added to solve AM issue 609, in a long PR with lots of changes), so the current unsee implementation is slightly off from how Alertmanager names this: the filter should actually be @state rather than @status.

"status": {
    "inhibitedBy": [],
    "silencedBy": [],
    "state": "active"
}


We should rename it before releasing 0.5, any objections @jamesog / @mattbostock / @Tenzer ?

Handling different Alertmanager API variants

Alertmanager API isn't stable yet, which means that 0.4 branch API is incompatible with 0.5.
Let's cut 0.1 unsee branch that targets Alertmanager 0.4 and 0.2 branch that targets 0.5 API.
With that we can release v0.1.0 of the 0.1 branch and start work on v0.2.0.
After that let's update README and document which unsee one needs for given Alertmanager version.

unsee_collected_alerts should include receiver

From #128#issuecomment-313331723

Would it be an idea to have the unsee_collected_alerts split into which receiver they are for? As it is right now it's just a total number of all alerts received for the system, but adding the receiver as a dimension would allow different teams using the same Alertmanager instance to dig into how many alerts they currently have going off.

Seems like a good idea, but current code doesn't make it easy to add.

Support multiple Alertmanager versions

We can make unsee support multiple versions by first checking remote Alertmanager version via GET /api/v1/status, reading the version and using the correct model for response handling.
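A rough sketch of that flow is shown below. The JSON layout (data.versionInfo.version) is an assumption about the /api/v1/status payload, and the mapper names are placeholders: "v04" echoes the package seen in the stack trace in another issue, while "v05" and the version cut-off are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusResponse models only the field needed here. The layout is an
// assumption about the /api/v1/status response.
type statusResponse struct {
	Data struct {
		VersionInfo struct {
			Version string `json:"version"`
		} `json:"versionInfo"`
	} `json:"data"`
}

// parseVersion extracts the Alertmanager version from a status response.
func parseVersion(body []byte) (string, error) {
	var s statusResponse
	if err := json.Unmarshal(body, &s); err != nil {
		return "", fmt.Errorf("decoding status response: %w", err)
	}
	if s.Data.VersionInfo.Version == "" {
		return "", fmt.Errorf("status response contains no version")
	}
	return s.Data.VersionInfo.Version, nil
}

// pickMapper returns the name of the (hypothetical) response model to use
// for the given version, based on its major and minor numbers.
func pickMapper(version string) (string, error) {
	var major, minor int
	if _, err := fmt.Sscanf(version, "%d.%d", &major, &minor); err != nil {
		return "", fmt.Errorf("unrecognised version %q: %w", version, err)
	}
	if major == 0 && minor <= 4 {
		return "v04", nil
	}
	return "v05", nil
}

func main() {
	body := []byte(`{"status":"success","data":{"versionInfo":{"version":"0.15.1"}}}`)
	version, err := parseVersion(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	mapper, err := pickMapper(version)
	fmt.Println(version, mapper, err)
}
```

Returning an error for an unrecognised version, rather than panicking, lets the caller fall back to a default model or report the instance as unsupported.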

"Internal error TypeError Alerts is undefined"

Unsee does not work on our dashboard TVs (running PlayIPP), with the error message "Internal error TypeError Alerts is undefined". Any idea what would cause this? Just poor javascript support in the box?

Proxy client requests through unsee instance

We've protected the Alertmanager and unsee instances behind the proxy server.
But submitting silences does not work, because Alertmanager is queried directly from the client side.
Is there any way to proxy those requests through the unsee instance to Alertmanager?

Support for both private and public endpoints for alertmanager

Hi,

Here is my setup:

Unsee in a docker container accessible from internet through oauth2_proxy.
Alertmanager in a docker container accessible from both a private network with no authentication and from internet through oauth2_proxy.
Unsee talks to Alertmanager via the private network.

It works well but the silence feature unfortunately tries to use this private address of the Alertmanager declared in the conf.

It would be nice if we could define, for each Alertmanager, a public address that the silence feature would use.

Cheers.

Feature: Create silences

I would like to see the ability to create silences without an alert being fired.

Silence on proxied alertmanagers give a 404

When I try to create a silence, I get a 404 from unsee.

It is not even making API calls to alertmanager.

Option for inverse of STRIP_LABELS

We're trying to clear up / simplify our unsee dashboard by using STRIP_LABELS - which is great. However many exporters add additional labels and adding them all to a list can be tedious.

Could an INCLUDE_LABELS option be added, where we specifically say: we want to see labels a, b, and c, but nothing else?

Thank ya.
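The requested behaviour amounts to an allow-list applied to each alert's labels. A minimal sketch, where keepLabels is a hypothetical helper and INCLUDE_LABELS is the option name proposed above rather than an existing setting:

```go
package main

import "fmt"

// keepLabels returns a copy of labels containing only the names listed in
// keep. Names in keep that are absent from labels are ignored.
func keepLabels(labels map[string]string, keep []string) map[string]string {
	out := make(map[string]string, len(keep))
	for _, name := range keep {
		if v, ok := labels[name]; ok {
			out[name] = v
		}
	}
	return out
}

func main() {
	labels := map[string]string{
		"alertname": "DiskFull",
		"instance":  "db1",
		"job":       "node",
		"exporter":  "node_exporter",
	}
	// Equivalent of a hypothetical INCLUDE_LABELS="alertname instance".
	fmt.Println(keepLabels(labels, []string{"alertname", "instance"}))
}
```

The original map is left untouched, so stripped labels would still be available for filtering if needed.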

Make this repo public

There are a few open PRs, can we open this repo once they are merged?
Are there any other outstanding issues preventing us from making it public?

Small tasks to be done once it's open:

Add mock data to this repo - pointless before it's marked public as we won't be able to access those files
Setup travis - we could use CI, travis will be easy to add
Build an official docker image and put it on docker hub ?

custom alerts grouping

Hi. Thanks for the nice tool!
We (Upwork) have just one major problem with Unsee.
In our AlertManager setup we are grouping the same alerts by different criteria in order to provide different email notifications. Specifically, we have a lot of microservices maintained by several teams. Every team maintains more than one service. Our alerts have both "service" and "team" labels. And AlertManager groups the alerts by service for daily email notification and groups the same alerts by team for weekly digest. So we have 2 nodes in our routing tree.
But Unsee groups all alerts by team only! It seems to be the first group found in the AlertManager response, and Unsee just uses it. That is not convenient for us. We'd prefer to have the alerts grouped by service in Unsee, but unfortunately this is not configurable at all.
Have you ever considered storing all alerts received from AlertManager as a plain list without any grouping, and then allowing the user to define any "groupBy" condition they want? Ideally directly in the UI, but in the config file is OK too. It'd be really cool, at least for us.
Thanks in advance for any answer!

Links in annotations?

Hi!
Thanks for unsee! I'd like to include clickable links in annotation text (e.g. links to relevant logs, dashboards, etc). Is there a way to do that? AFAICS the annotation text isn't styled in any way right now.

thanks!

Build ERROR: use of internal package not allowed

I'm trying to build with Go 1.9 and I am getting this error. I tried both 0.8 and master.

  [11] ./unsee.js 12.7 kB {0} [built] [prefetched]
  [15] ./templates.js 2.84 kB {0} [built]
  [25] ./alerts.js 5.69 kB {0} [built]
  [27] ./autocomplete.js 2.04 kB {0} [built]
  [28] ./colors.js 1.68 kB {0} [built]
  [29] ./config.js 3.45 kB {0} [built]
  [37] ./filters.js 7.63 kB {0} [built]
  [50] ./counter.js 1.81 kB {0} [built]
  [51] ./grid.js 1.43 kB {0} [built]
  [53] ./summary.js 1.82 kB {0} [built]
 [266] ./progress.js 1.03 kB {0} [built]
 [270] ./silence.js 13.5 kB {0} [built]
 [271] ./unsilence.js 3.07 kB {0} [built]
 [272] ./watchdog.js 1.71 kB {0} [built]
 [273] ./help.js 126 bytes {1} [built] [prefetched]
    + 273 hidden modules
go-bindata-assetfs  -prefix assets -nometadata assets/templates/... assets/static/dist/...
go build -ldflags "-X main.version="
alerts.go:6:2: use of internal package not allowed
main.go:10:2: use of internal package not allowed
alerts.go:7:2: use of internal package not allowed
alerts.go:8:2: use of internal package not allowed
views.go:13:2: use of internal package not allowed
main.go:11:2: use of internal package not allowed
make: *** [unsee] Error 1

Please help me. Thanks
@prymitive any thoughts

Allow a user to be able to add additional labels to a silence via alertmanager ui

It would be good to be able to specify labels that are not part of a currently firing alert so that we can reduce the number of silences that are needed.

For example, I would like to do the following:
{ "node": "foo", "instance": "foo" }

on an alert where only instance is firing

Add support for integration tests

We need to run more strict tests than just the ones using mock files. We need to add support for integration tests that will spawn an instance of Alertmanager, generate some alerts and check if we can read those correctly.

Build unsee binary on CI and publish to Github

By popular request from @terinjokes - people want binaries, so let's have it.

Failed with AM 0.5.1

The silences format unsee is expecting seems different from what is being sent by alertmanager 0.5.1,
I see the following error.

ERRO[0000] json: cannot unmarshal array into Go struct field SilenceAPIResponse.data of type alertmanager.silencesData

The changes in #23 resolved this error. It may be that you are running against a newer version of AM.

gin request metrics contain arbitrary URLs, resulting in high cardinality

The gin_requests_total metrics output by Unsee contain the request URI, which can contain arbitrary values. This produces very high cardinality in Unsee's metrics.

The handler label should be sufficient; access logs can be used for more detailed analysis.
