Comments (8)

nobuto-m commented on August 17, 2024

It looks like those failures occur every 5 minutes, which matches the interval of the update-status hook.

pods.log

from alertmanager-k8s-operator.

nobuto-m commented on August 17, 2024

Hmm, scratch that. dial tcp 127.0.0.1:5001: connect: connection refused is still happening every 5 min even after setting update-status-hook-interval=30m.

2024-03-18T13:38:41.800801283Z stdout F 2024-03-18T13:38:41.800Z [alertmanager] ts=2024-03-18T13:38:41.800Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"alertmanager\", juju_model=\"cos\", juju_model_uuid=\"4ccf0ff7-981f-45eb-86d9-4c6f0b922527\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:38:42.215479478Z stdout F 2024-03-18T13:38:42.215Z [container-agent] 2024-03-18 13:38:42 INFO juju.worker.uniter.operation runhook.go:186 ran "update-status" hook (via hook dispatching script: dispatch)
2024-03-18T13:43:41.8041126Z stdout F 2024-03-18T13:43:41.803Z [alertmanager] ts=2024-03-18T13:43:41.803Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="placeholder/webhook[0]: notify retry canceled after 16 attempts: Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:43:41.804073306Z stdout F 2024-03-18T13:43:41.803Z [alertmanager] ts=2024-03-18T13:43:41.803Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="placeholder/webhook[0]: notify retry canceled after 16 attempts: Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:43:41.804506534Z stdout F 2024-03-18T13:43:41.804Z [alertmanager] ts=2024-03-18T13:43:41.804Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"alertmanager\", juju_model=\"cos\", juju_model_uuid=\"4ccf0ff7-981f-45eb-86d9-4c6f0b922527\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:43:41.804520861Z stdout F 2024-03-18T13:43:41.804Z [alertmanager] ts=2024-03-18T13:43:41.804Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"microk8s\", juju_model=\"cos-microk8s\", juju_model_uuid=\"b96b05ee-afa6-46fd-8ec7-02ca7528a5d9\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:48:41.8047974Z stdout F 2024-03-18T13:48:41.804Z [alertmanager] ts=2024-03-18T13:48:41.804Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="placeholder/webhook[0]: notify retry canceled after 17 attempts: Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:48:41.804756764Z stdout F 2024-03-18T13:48:41.804Z [alertmanager] ts=2024-03-18T13:48:41.804Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="placeholder/webhook[0]: notify retry canceled after 16 attempts: Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:48:41.805093239Z stdout F 2024-03-18T13:48:41.805Z [alertmanager] ts=2024-03-18T13:48:41.804Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"microk8s\", juju_model=\"cos-microk8s\", juju_model_uuid=\"b96b05ee-afa6-46fd-8ec7-02ca7528a5d9\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:48:41.805115331Z stdout F 2024-03-18T13:48:41.805Z [alertmanager] ts=2024-03-18T13:48:41.804Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"alertmanager\", juju_model=\"cos\", juju_model_uuid=\"4ccf0ff7-981f-45eb-86d9-4c6f0b922527\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:53:41.8056064Z stdout F 2024-03-18T13:53:41.805Z [alertmanager] ts=2024-03-18T13:53:41.805Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="placeholder/webhook[0]: notify retry canceled after 16 attempts: Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:53:41.80602439Z stdout F 2024-03-18T13:53:41.805Z [alertmanager] ts=2024-03-18T13:53:41.805Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"microk8s\", juju_model=\"cos-microk8s\", juju_model_uuid=\"b96b05ee-afa6-46fd-8ec7-02ca7528a5d9\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:53:41.805644461Z stdout F 2024-03-18T13:53:41.805Z [alertmanager] ts=2024-03-18T13:53:41.805Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="placeholder/webhook[0]: notify retry canceled after 17 attempts: Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"
2024-03-18T13:53:41.805995605Z stdout F 2024-03-18T13:53:41.805Z [alertmanager] ts=2024-03-18T13:53:41.805Z caller=notify.go:745 level=warn component=dispatcher receiver=placeholder integration=webhook[0] aggrGroup="{}:{juju_application=\"alertmanager\", juju_model=\"cos\", juju_model_uuid=\"4ccf0ff7-981f-45eb-86d9-4c6f0b922527\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"<redacted>\": dial tcp 127.0.0.1:5001: connect: connection refused"

nobuto-m commented on August 17, 2024

oh...

root@alertmanager-0:/# cat /etc/alertmanager/alertmanager.yml 
global:
  http_config:
    tls_config:
      insecure_skip_verify: false
receivers:
- name: placeholder
  webhook_configs:
  - url: http://127.0.0.1:5001/
route:
  group_by:
  - juju_application
  - juju_model_uuid
  - juju_model
  group_interval: 5m
  group_wait: 30s
  receiver: placeholder
  repeat_interval: 1h

sed-i commented on August 17, 2024

Hi @nobuto-m,
Yes, this is coming from the placeholder receiver.
Alertmanager won't start without this config.
You would need to provide your own "real" config via a charm config option.

nobuto-m commented on August 17, 2024

How exactly? I didn't see a relevant topic in the documentation and config.
https://charmhub.io/topics/canonical-observability-stack
https://charmhub.io/alertmanager-k8s/configuration

simskij commented on August 17, 2024

> How exactly? I didn't see a relevant topic in the documentation and config.
> https://charmhub.io/topics/canonical-observability-stack
> https://charmhub.io/alertmanager-k8s/configuration

It's linked in the description of the config_file property on the second page you linked. https://www.prometheus.io/docs/alerting/latest/configuration/

nobuto-m commented on August 17, 2024

I mean, do operators have to write the whole config of alertmanager.yml just to specify where to send alerts? Do they have to know the following trick without documentation?

  group_by:
  - juju_application
  - juju_model_uuid
  - juju_model

simskij commented on August 17, 2024

> I mean, do operators have to write the whole config of alertmanager.yml just to specify where to send alerts? Do they have to know the following trick without documentation?
>
>   group_by:
>   - juju_application
>   - juju_model_uuid
>   - juju_model

Yes, that's how it works. As for the group_by, it is injected automatically; the user does not need to supply it.

We are looking to provide some common config examples in the docs in the future, but at the moment that's how it is.
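
For illustration only (this is not taken from the charm documentation), a minimal sketch of a config that swaps the placeholder receiver for a real webhook might look like the following. The receiver name `ops-webhook` and the URL are hypothetical; the timing values are copied from the generated alertmanager.yml shown earlier in this thread. `group_by` is left out on the assumption, per the comment above, that the charm injects it.

```yaml
# alertmanager.yml (hypothetical example; adjust the receiver to your environment)
route:
  receiver: ops-webhook        # must match a name under "receivers"
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
receivers:
- name: ops-webhook            # hypothetical receiver name
  webhook_configs:
  - url: http://alerts.example.internal:5001/   # hypothetical endpoint that accepts Alertmanager webhooks
```

Assuming the application is deployed as `alertmanager` (as the `juju_application` label in the logs suggests), a file like this could presumably be passed through the `config_file` option mentioned above, for example with `juju config alertmanager config_file=@alertmanager.yml`.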
