Comments (6)
The client retries retriable errors internally and does not bubble them up. A poll can return a few errors, all of which the client cannot handle and which are essentially unrecoverable:
- not authorized
- unsupported compression (you should not receive this)
- unsupported message version (you should also not receive this)
- unknown error
I think in these cases, the partition is repeatedly fetched, so these errors should crop up continuously.
If the partition first requires loading offsets or listing the broker epoch (for data loss detection) and that fails with a non-retriable error (auth), the client does not try fetching the partition again.
It's currently an oversight that the client continues retrying on auth failures in the fetch response itself.
For step 2, the records should always be processed -- any successfully processed record in the client advances the client's offset. So, skipping records is not necessary and could actually lead to missed records.
Primarily, you should not get any errors from PollFetches, so any that you do receive should be investigated.
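The handling described above (always process the returned records, then surface any errors for investigation) can be sketched roughly as follows. This is a minimal illustration only: `record`, `fetchError`, `pollResult`, and `handlePoll` are hypothetical stand-ins written for this example, not the franz-go types or API.

```go
package main

import (
	"errors"
	"fmt"
)

// record, fetchError, and pollResult are hypothetical stand-ins for what a
// poll returns; they are not the franz-go types.
type record struct {
	Topic     string
	Partition int32
	Offset    int64
}

type fetchError struct {
	Topic     string
	Partition int32
	Err       error
}

type pollResult struct {
	Records []record
	Errs    []fetchError
}

// errNotAuthorized models one of the unrecoverable errors listed in the thread.
var errNotAuthorized = errors.New("not authorized")

// handlePoll processes every returned record first, then formats each error
// so it can be logged and investigated. Records are never skipped just
// because the same poll also carried an error.
func handlePoll(res pollResult, process func(record)) (int, []string) {
	for _, r := range res.Records {
		process(r)
	}
	var problems []string
	for _, fe := range res.Errs {
		problems = append(problems,
			fmt.Sprintf("topic %s partition %d: %v", fe.Topic, fe.Partition, fe.Err))
	}
	return len(res.Records), problems
}

func main() {
	res := pollResult{
		Records: []record{{"events", 0, 41}, {"events", 0, 42}},
		Errs:    []fetchError{{"events", 1, errNotAuthorized}},
	}
	n, problems := handlePoll(res, func(r record) {
		fmt.Println("processed offset", r.Offset)
	})
	fmt.Println("records:", n)
	for _, p := range problems {
		fmt.Println("investigate:", p)
	}
}
```

In a real consumer the error branch would most likely log and alert rather than print, since these errors need manual intervention.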
Let me know if this clarifies anything at all. I think this points to my documentation needing a touch-up. I've also made a note to address the fetch loop returning the same auth errors, rather than returning the error once and not fetching anymore.
from franz-go.
@twmb Thanks for the clarification. It sounds like the only errors that the consumer will get are non-retriable errors, and there isn't much the consumer can do other than log the errors and wait for manual intervention.
from franz-go.
Yep that's the goal!
Here is the place where errors are inspected when consuming records. Note that only the default case does not clear the error; all other cases are for retriable errors, and the client clears the error:
Lines 673 to 752 in 8178f2c
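The shape of that inspection can be illustrated with a small sketch. This is not the client's source: the error values and the `surfaceError` helper are hypothetical, and the set of retriable cases shown is only an example, not the client's actual list.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values standing in for broker error codes.
var (
	errNotLeader     = errors.New("not leader for partition")
	errUnknownTopic  = errors.New("unknown topic or partition")
	errNotAuthorized = errors.New("topic authorization failed")
)

// surfaceError returns the error that should reach the caller.
// Retriable errors are cleared (nil) because the client retries them
// internally; only the default case keeps the error.
func surfaceError(err error) error {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, errNotLeader), errors.Is(err, errUnknownTopic):
		// Retriable: handled internally, so the error is cleared.
		return nil
	default:
		// Everything else is kept and returned from the poll.
		return err
	}
}

func main() {
	fmt.Println(surfaceError(errNotLeader))     // <nil>
	fmt.Println(surfaceError(errNotAuthorized)) // topic authorization failed
}
```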
This is the place where ListOffsets or OffsetForLeaderEpoch can have a fatal error:
Lines 893 to 907 in 8178f2c
As mentioned in my prior comment, I'm going to address the source.go code to stop using a partition if there is an auth error, and I'll also add a configuration option to force continuing to use the partition on auth errors (and make the list offsets non-retriable error behave the same as the source.go error).
from franz-go.
Thanks. I'll reopen the issue if I have any further questions regarding this.
from franz-go.
I'm going to reopen this issue to keep track of this open inconsistency; I plan to address this soon. There's a need to handle the client retrying on auth failure, allowing an operator to start a client before it is authorized, or to rotate credentials.
from franz-go.
I closed this with the most recent commit (which will be in the next tagged release in a week or so-ish). The new behavior continues to retry loading a partition's offset or epoch even in the face of non-retriable errors: we will back off 1s and retry, rather than permanently failing the partition.
This is a bit different from my prior comment's goals, which were to stop consuming if a non-retriable error occurred. I think it's safer and probably better to just always continue retrying.
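The "back off and retry" behavior can be sketched as a simple loop. This is an illustration under assumptions, not the client's implementation: `loadFunc`, `retryLoad`, `failFirst`, and `demoRetry` are hypothetical names, and the demo uses a 10ms backoff instead of the 1s mentioned above so that it finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// loadFunc is a hypothetical stand-in for loading a partition's offset or epoch.
type loadFunc func(ctx context.Context) error

// retryLoad keeps calling load until it succeeds or ctx is done, sleeping
// backoff between attempts. It never gives up on the partition because of
// the kind of error. It returns the number of attempts made.
func retryLoad(ctx context.Context, backoff time.Duration, load loadFunc) (int, error) {
	attempts := 0
	for {
		attempts++
		if err := load(ctx); err == nil {
			return attempts, nil
		}
		select {
		case <-ctx.Done():
			return attempts, ctx.Err()
		case <-time.After(backoff):
		}
	}
}

// failFirst returns a load function that fails its first n calls, modeling
// a client that is started before it is authorized.
func failFirst(n int) loadFunc {
	calls := 0
	return func(context.Context) error {
		calls++
		if calls <= n {
			return errors.New("not authorized")
		}
		return nil
	}
}

// demoRetry runs the loop with a short backoff so the example is fast.
func demoRetry() (int, error) {
	return retryLoad(context.Background(), 10*time.Millisecond, failFirst(2))
}

func main() {
	attempts, err := demoRetry()
	fmt.Println(attempts, err) // 3 <nil>
}
```

Retrying indefinitely lets an operator grant authorization or rotate credentials without restarting the client, which matches the motivation given earlier in the thread.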
from franz-go.