Comments (8)

solsson commented on August 28, 2024

The source of the zookeeper probe is here: 2c4b6cd

I logged #65 based on #63 and my own observations of this failing.

What does your TCP probe look like? The one I came up with for Kafka also pollutes the logs.

from kubernetes-kafka.

elm- commented on August 28, 2024

Yes, I used that version of the probe. With it, I could not keep a ZooKeeper instance ready for more than a minute. It kept failing without any indication that I could see. Running nc -w 127.0.0.1 2181 ; echo $? always outputs 0 when I attach to the container, even while Kubernetes reports the probe as failing, so I have no idea why it doesn't work.

Yes, the TCP probe floods the ZooKeeper logs with WARN messages (see below), but it logs nothing in Kafka.

[2017-10-13 13:44:50,035] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
        at java.lang.Thread.run(Thread.java:748)
[2017-10-13 13:45:00,034] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
        at java.lang.Thread.run(Thread.java:748)

If it annoys you, I guess you could probe less often by raising periodSeconds in the probe, or reconfigure the log level. We alert on error and warning rates in our monitoring, and we can exclude this message individually so that it does not trigger an alert, so I don't see any issue there.

This is my TCP probe for Kafka:

        readinessProbe:
          tcpSocket:
            port: 9092
          timeoutSeconds: 10

The ZooKeeper probe is the same, just with a different port. I think the timeout can also be lower for a TCP-based probe, since the socket should always accept connections. From experience, we set it a bit higher so that it does not fail on random events such as garbage collection pauses.
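For comparing against the tcpSocket probe by hand, a rough shell approximation of "can the port be opened" might look like the sketch below. The kubelet performs tcpSocket probes itself, so this is only an approximation for debugging inside the pod. The function name tcp_open is hypothetical, and the sketch assumes an nc variant that supports -z (connect without sending data) and -w (timeout in seconds), which not every nc build does.

```shell
#!/bin/sh
# Hypothetical helper for manual debugging: succeeds if a TCP connection to
# host:port can be opened within the timeout. No data is sent.
# Assumes an nc variant that supports -z (connect only) and -w (timeout).
tcp_open() {
  host="$1"
  port="$2"
  timeout="${3:-10}"   # default mirrors timeoutSeconds: 10 from the probe above
  nc -z -w "$timeout" "$host" "$port"
}
```

For example, tcp_open 127.0.0.1 9092 would check the Kafka port from the probe above, and tcp_open 127.0.0.1 2181 2 would check ZooKeeper with a two-second timeout.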

elm- commented on August 28, 2024

Addendum: Yes, I saw #65. Did you try running the command via exec while Kubernetes reported the probe as failing? For me that works, so I guess the issue is with the probe, not the actual command. Also, mine fails consistently every minute or so.

elm- commented on August 28, 2024

I hadn't posted this yet: this is what I see when I run the probe command inside the container. I only rarely get imok back, even though the ZooKeeper process is fine the whole time.

imokroot@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181

elm- commented on August 28, 2024

Actually this works:

root@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka#

-q 1 makes nc wait one second after EOF on stdin before quitting, which leaves time for the response to arrive.
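Putting the two observations together (the exit status of nc was 0 even when no reply came back, and -q lets the reply arrive), a probe command could compare the reply text itself. This is only a minimal sketch: the function name zk_ruok is hypothetical, the host and port are the ones used in this thread, and -q is not available in every nc variant.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if ZooKeeper answers "imok" to "ruok".
# Assumes an nc variant that supports -q (wait N seconds after EOF on stdin).
zk_ruok() {
  host="${1:-127.0.0.1}"
  port="${2:-2181}"
  # Compare the reply itself; the exit status of nc alone does not show
  # whether a reply actually arrived.
  response="$(echo ruok | nc -q 1 "$host" "$port")"
  [ "$response" = "imok" ]
}
```

Calling zk_ruok without arguments checks 127.0.0.1:2181 and returns a non-zero status when the reply is empty or anything other than imok.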

elm- commented on August 28, 2024

Created #74 as a suggestion to fix it.

solsson commented on August 28, 2024

It was actually GitHub that closed this :) Please reopen if you think the TCP-based probe is still preferable.

elm- commented on August 28, 2024

For Kafka maybe, but for ZooKeeper the ruok-based probe makes more sense, as it waits until the node is actually synchronized.
