Comments (8)
The source of the zookeeper probe is here: 2c4b6cd
I logged #65 based on #63 and my own observations of this failing.
What does your TCP probe look like? The one I came up with for Kafka also pollutes the logs.
from kubernetes-kafka.
Yes, I used that version of the probe. With it, I could not keep a ZooKeeper instance ready for more than a minute. It was failing without any indication I could see. Running nc -w 127.0.0.1 2181 ; echo $?
always outputs 0 when I attach to the pod, even while Kubernetes reports the probe as failing, so I have no idea why it doesn't work.
Yes, the TCP probe floods the ZooKeeper logs with WARN entries (see below), but it logs nothing in Kafka.
[2017-10-13 13:44:50,035] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
[2017-10-13 13:45:00,034] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
If it annoys you, I guess you could probe less often by increasing periodSeconds in the probe, or reconfigure the log level. We alert on error and warning rates in our monitoring and can exclude this particular message so that it does not trigger, so I don't see any issue there.
This is my TCP probe for Kafka:
readinessProbe:
  tcpSocket:
    port: 9092
  timeoutSeconds: 10
The ZooKeeper probe is the same, just with a different port. The timeout could probably be lower for a TCP-based probe, since the socket should always accept connections. From experience, we set it a bit higher so that it does not fail on transient events such as garbage collection pauses.
from kubernetes-kafka.
Addendum: Yes, I saw #65. Did you try to run the command via exec while Kubernetes reported the probe as failing? For me that works, so I guess the issue is with the probe, not the actual command. Also, mine fails consistently every minute or so.
from kubernetes-kafka.
I didn't post this earlier: this is what happens when I run the probe command inside the container. I only rarely get imok back, even though the ZooKeeper process is OK the whole time.
imokroot@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
root@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -w 1 127.0.0.1 2181
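A repeated manual test like the one above could be scripted to count how often the reply comes back. This is only a sketch: count_imok is a hypothetical helper name, not something from the repository, and it assumes a POSIX shell inside the container.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and report how many runs
# printed exactly "imok" on stdout.
# Usage: count_imok <attempts> <command> [args...]
count_imok() {
  attempts="$1"
  shift
  ok=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Command substitution strips trailing newlines, so "imok" with or
    # without a newline both match.
    if [ "$("$@" 2>/dev/null)" = "imok" ]; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok/$attempts"
}
```

For example, count_imok 20 sh -c 'echo ruok | nc -w 1 127.0.0.1 2181' would print the number of successful replies out of 20 attempts.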
from kubernetes-kafka.
Actually this works:
root@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka# echo ruok | nc -q 1 127.0.0.1 2181
imokroot@zoo-0:/opt/kafka#
-q 1 makes nc wait 1 second after EOF on stdin before quitting, which leaves time for the response to arrive.
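The check could be wrapped so that it succeeds only when the reply is exactly imok, which is what an exec-based probe needs. This is a minimal sketch, not the probe from the repository: zk_ruok, ZK_HOST and ZK_PORT are hypothetical names, and it assumes an nc build that supports -q, as in the session above.

```shell
#!/bin/sh
# Hypothetical helper: return 0 only if ZooKeeper answers "imok" to "ruok".
# ZK_HOST and ZK_PORT are illustrative; the defaults mirror the commands above.
zk_ruok() {
  host="${ZK_HOST:-127.0.0.1}"
  port="${ZK_PORT:-2181}"
  # -q 1 keeps nc alive for 1 second after stdin reaches EOF, so the
  # reply can be read before nc exits.
  response="$(echo ruok | nc -q 1 "$host" "$port")"
  [ "$response" = "imok" ]
}
```

Calling zk_ruok as the last command of a probe script makes the script's exit status reflect the reply rather than only whether the connection succeeded.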
from kubernetes-kafka.
Created #74 as a suggestion to fix it.
from kubernetes-kafka.
It was actually GitHub that closed this :) Please reopen if you think the TCP-based probe is still preferable.
from kubernetes-kafka.
For Kafka maybe, but for Zookeeper this makes more sense as it waits until the node is actually synchronized.
from kubernetes-kafka.