Comments (6)

dstathis avatar dstathis commented on July 17, 2024

Thanks!

For bonus points we could detect if there would be changes to the config file and only call the reload if needed.

I believe this is a must-have, as long as alertmanager reload/restart has a chance of being triggered as part of the update-status hook.

If the reload API call works correctly, it should be a no-op when there is no config change.

from alertmanager-k8s-operator.

przemeklal avatar przemeklal commented on July 17, 2024

It seems to be restarting on the update-status hook. 10.152.183.1 is the Kubernetes API service in this env.

$ kubectl logs -n cos alertmanager-0
...
2023-12-13T10:41:36.684Z [container-agent] 2023-12-13 10:41:36 ERROR juju-log Failed to obtain status: Bad response
2023-12-13T10:41:36.770Z [container-agent] 2023-12-13 10:41:36 INFO juju-log HTTP Request: GET https://10.152.183.1/apis/apps/v1/namespaces/cos/statefulsets/alertmanager "HTTP/1.1 200 OK"
2023-12-13T10:41:36.848Z [container-agent] 2023-12-13 10:41:36 INFO juju-log HTTP Request: GET https://10.152.183.1/api/v1/namespaces/cos/pods/alertmanager-0 "HTTP/1.1 200 OK"
2023-12-13T10:41:36.880Z [container-agent] 2023-12-13 10:41:36 INFO juju-log reqs=ResourceRequirements(claims=None, limits={}, requests={'cpu': '0.25', 'memory': '200Mi'}), templated=ResourceRequirements(claims=None, limits=None, requests={'cpu': '250m', 'memory': '200Mi'}), actual=ResourceRequirements(claims=None, limits=None, requests={'cpu': '250m', 'memory': '200Mi'})
2023-12-13T10:41:36.925Z [container-agent] 2023-12-13 10:41:36 INFO juju-log HTTP Request: GET https://10.152.183.1/apis/apps/v1/namespaces/cos/statefulsets/alertmanager "HTTP/1.1 200 OK"
2023-12-13T10:41:36.998Z [container-agent] 2023-12-13 10:41:36 INFO juju-log HTTP Request: GET https://10.152.183.1/api/v1/namespaces/cos/pods/alertmanager-0 "HTTP/1.1 200 OK"
2023-12-13T10:41:38.515Z [container-agent] 2023-12-13 10:41:38 WARNING juju-log config reload via HTTP POST failed: Bad response
2023-12-13T10:41:38.522Z [container-agent] 2023-12-13 10:41:38 INFO juju-log Restarting service alertmanager
2023-12-13T10:41:40.139Z [container-agent] 2023-12-13 10:41:40 WARNING juju-log cannot determine if reload succeeded
2023-12-13T10:41:40.518Z [container-agent] 2023-12-13 10:41:40 INFO juju.worker.uniter.operation runhook.go:186 ran "update-status" hook (via hook dispatching script: dispatch)
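
For reference, the log above shows the HTTP POST reload failing before the charm falls back to a restart. Alertmanager exposes a `POST /-/reload` endpoint for this. Below is a minimal sketch of such a call, not the charm's actual implementation: the function name `reload_alertmanager`, the injectable `urlopen` parameter, and the `localhost:9093` address in the usage note are assumptions for illustration, and the path would differ if a route prefix is configured.

```python
import urllib.request
from typing import Callable


def reload_alertmanager(
    base_url: str,
    urlopen: Callable = urllib.request.urlopen,  # injectable for testing (assumption)
    timeout: float = 5.0,
) -> bool:
    """POST to Alertmanager's /-/reload endpoint; return True only on HTTP 200."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/-/reload",
        data=b"",
        method="POST",
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError, HTTPError and socket timeouts all derive
        # from OSError, so connection and HTTP failures land here.
        return False
```

A caller could use `reload_alertmanager("http://localhost:9093")` and decide what to do when it returns `False`, instead of restarting unconditionally.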

przemeklal avatar przemeklal commented on July 17, 2024

It seems that each update-status calls _common_exit_hook which includes this logic:

        # Reload or restart the service
        try:
            self.alertmanager_workload.reload()
        except ConfigUpdateFailure as e:
            self.unit.status = BlockedStatus(str(e))
            return

and as a result, it triggers a restart/reload every 5 minutes by default.

dstathis avatar dstathis commented on July 17, 2024

It seems that each update-status calls _common_exit_hook which includes this logic:

        # Reload or restart the service
        try:
            self.alertmanager_workload.reload()
        except ConfigUpdateFailure as e:
            self.unit.status = BlockedStatus(str(e))
            return

and as a result, it triggers a restart/reload every 5 minutes by default.

As an additional note, the BlockedStatus set in the code here should be an error status, as there is no user action that resolves it.

dstathis avatar dstathis commented on July 17, 2024

To fix this we should fix the reload API call. For bonus points we could detect if there would be changes to the config file and only call the reload if needed.
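
A rough sketch of the "only reload if needed" idea: hash the rendered config and skip the reload when the digest matches the last one applied. `ReloadGuard` and `config_digest` are hypothetical names, not part of the charm. The in-memory digest is also an assumption; a real charm would need the comparison to survive across hook invocations, for example by comparing against the file already present in the workload container.

```python
import hashlib
from typing import Callable, Optional


def config_digest(config_text: str) -> str:
    """Return a SHA-256 hex digest of the rendered config text."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


class ReloadGuard:
    """Call `reload` only when the config differs from the last one applied.

    Hypothetical helper for illustration only.
    """

    def __init__(self, reload: Callable[[], None]) -> None:
        self._reload = reload
        self._last_digest: Optional[str] = None

    def apply(self, config_text: str) -> bool:
        """Reload if the config changed; return True when a reload was issued."""
        digest = config_digest(config_text)
        if digest == self._last_digest:
            return False  # unchanged config: no reload, no restart
        self._reload()
        # Record the digest only after a successful reload, so that a
        # failed reload (an exception) is retried on the next call.
        self._last_digest = digest
        return True
```

With a guard like this, a periodic update-status hook that renders the same config each time would trigger at most one reload rather than one every interval.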

przemeklal avatar przemeklal commented on July 17, 2024

Thanks!

For bonus points we could detect if there would be changes to the config file and only call the reload if needed.

I believe this is a must-have, as long as alertmanager reload/restart has a chance of being triggered as part of the update-status hook.
