
Comments (6)

vkuznet commented on August 24, 2024

I updated the WMArchive configuration to use a 1 min threshold for recv/send timeouts on the production and testbed clusters (FYI: @arooshap, @muhammadimranfarooqi). Apart from that, as I explained earlier, I am no longer allocated to development on services outside of the WM area, and further development efforts should be addressed via @klannon.

from wmarchive.

yuyiguo commented on August 24, 2024

@vkuznet
What is the average duration of the WMArchive connection to the AMQ brokers? Do you have monitoring to share?


vkuznet commented on August 24, 2024

Yuyi, you can find the relevant information here: https://monit-grafana.cern.ch/d/u_qOeVqZk/wmarchive-monit?orgId=11 and https://monit-grafana.cern.ch/d/wma-service/wmarchive-service?orgId=11 (the first one contains the latency plot).


yuyiguo commented on August 24, 2024

Valentin, which plots show the WMArchive-to-AMQ connection duration or disconnection rate?


vkuznet commented on August 24, 2024

Yuyi, I pointed to the existing dashboards, but they do not show the duration of the AMQ connection; someone should add this to the code. That said, it is trivial to see from the WMArchive logs (vocms750:/cephfs/product/wma-logs/):

...
2023/05/14 00:05:32 stomp.go:168: send data to 188.185.13.100:61313 endpoint /topic/cms.jobmon.wmarchive
2023/05/14 00:05:32 stomp.go:168: send data to 188.185.11.68:61313 endpoint /topic/cms.jobmon.wmarchive
2023/05/14 00:06:28 stomp.go:168: send data to 188.185.35.176:61313 endpoint /topic/cms.jobmon.wmarchive
2023/05/14 00:06:28 wmarchive.go:298: POST /wmarchive/data/ 10.100.36.192:60508 [WMCore.Services.Requests/v002] [/DC=ch/DC=cern/OU=computers/CN=wmagent/vocms0255.cern.ch] [188.185.89.194] {"result":[{"ids":["c5b8aed966fc4585865d2da2ebfd1b0d"],"status":"ok"}]}
2023/05/14 00:06:31 stomp.go:168: send data to 188.184.92.147:61313 endpoint /topic/cms.jobmon.wmarchive
2023/05/14 00:06:31 stomp.go:168: send data to 188.184.92.147:61313 endpoint /topic/cms.jobmon.wmarchive

So the connection did not last more than a minute: the logs show every time WMArchive sends data, and the timestamps show that we usually have a few log entries within a minute.
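To put numbers on the reading of the logs above, here is a minimal sketch that extracts the `send data` entries and computes the time between consecutive sends, overall and per broker. It assumes the log line layout shown in the excerpt; the names `SEND_RE`, `parse_sends`, `overall_gaps` and `per_broker_gaps` are hypothetical helpers, not part of WMArchive. Note that it measures the spacing between sends, which is only a proxy for connection duration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout, taken from the excerpt above:
# 2023/05/14 00:05:32 stomp.go:168: send data to <ip:port> endpoint <topic>
SEND_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) stomp\.go:\d+: "
    r"send data to (\S+) endpoint"
)


def parse_sends(lines):
    """Return [(timestamp, broker)] for every 'send data' log line."""
    sends = []
    for line in lines:
        match = SEND_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            sends.append((ts, match.group(2)))
    return sends


def overall_gaps(lines):
    """Seconds between consecutive sends, regardless of broker."""
    times = [ts for ts, _ in parse_sends(lines)]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def per_broker_gaps(lines):
    """Seconds between consecutive sends to the same broker."""
    last_seen = {}
    gaps = defaultdict(list)
    for ts, broker in parse_sends(lines):
        if broker in last_seen:
            gaps[broker].append((ts - last_seen[broker]).total_seconds())
        last_seen[broker] = ts
    return dict(gaps)


if __name__ == "__main__":
    # Lines copied from the excerpt above (topic names kept as-is).
    sample = [
        "2023/05/14 00:05:32 stomp.go:168: send data to 188.185.13.100:61313 endpoint /topic/cms.jobmon.wmarchive",
        "2023/05/14 00:05:32 stomp.go:168: send data to 188.185.11.68:61313 endpoint /topic/cms.jobmon.wmarchive",
        "2023/05/14 00:06:28 stomp.go:168: send data to 188.185.35.176:61313 endpoint /topic/cms.jobmon.wmarchive",
        "2023/05/14 00:06:31 stomp.go:168: send data to 188.184.92.147:61313 endpoint /topic/cms.jobmon.wmarchive",
        "2023/05/14 00:06:31 stomp.go:168: send data to 188.184.92.147:61313 endpoint /topic/cms.jobmon.wmarchive",
    ]
    print(overall_gaps(sample))     # [0.0, 56.0, 3.0, 0.0]
    print(per_broker_gaps(sample))  # {'188.184.92.147:61313': [0.0]}
```

Run over a full day of logs, the maximum of `overall_gaps` would show how long the service ever goes without sending, and `per_broker_gaps` would show how often an individual broker is actually reused.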


LionelCons commented on August 24, 2024

FWIW, I can confirm that the problem is still present. I still see an abnormal number of warnings coming from the cmsweb machines and linked to the small (1.5s) heart-beat threshold.
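For context on how a heart-beat value like this takes effect, here is a small sketch of the negotiation rule from the STOMP 1.1/1.2 specification, assuming the threshold mentioned above refers to the STOMP `heart-beat` header. The function name `negotiate_heartbeat` and the broker values in the example are hypothetical; the actual client and broker settings are not given in this thread.

```python
def negotiate_heartbeat(client, server):
    """Apply the STOMP heart-beat negotiation rule.

    client = (cx, cy) from the CONNECT frame, server = (sx, sy) from the
    CONNECTED frame, all in milliseconds. The first value is what the
    sender can guarantee to send, the second is what it wants to receive.
    Returns (client_to_server_ms, server_to_client_ms); 0 means disabled.
    """
    cx, cy = client
    sx, sy = server
    # Client -> server heart-beats: disabled if either side says 0,
    # otherwise sent every max(cx, sy) milliseconds.
    client_to_server = max(cx, sy) if cx and sy else 0
    # Server -> client heart-beats: same rule with sx and cy.
    server_to_client = max(sx, cy) if sx and cy else 0
    return client_to_server, server_to_client


if __name__ == "__main__":
    # Hypothetical values: client asks for 1.5 s, broker offers 10 s.
    print(negotiate_heartbeat((1500, 1500), (10000, 10000)))  # (10000, 10000)
```

Under this rule the effective interval is the larger of the two sides' values, so a small client-side value only matters when the broker side is at least as small.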

