The problem/use-case that the feature addresses Cur

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

There was some discussion of cluster V2 in <a class="issue-link js-issue-link" data-er

[NEW] Deterministic CLUSTER SHARDS Command about valkey HOT 13 OPEN

valkey-io commented on May 25, 2024 3

[NEW] Deterministic CLUSTER SHARDS Command

from valkey.

Comments (13)

PingXie commented on May 25, 2024 2

So I think it'd fine that clients can get the reply, then remove some fields (which we can document), then compute a hash of the rest of it.

I think the proposal is to have the server remove volatile fields, as opposed to clients doing the filtering.

from valkey.

PingXie commented on May 25, 2024 2

I'd think a binary filtering would be acceptable, like "topology or all". There is a straight story here IMO. Fine-grained filtering, agreed.

from valkey.

daniel-house commented on May 25, 2024 1

I like the word deterministic. If the output is not sorted it can't be either deterministic or canonical, so I like it being sorted.

Clearly you feel that the fields are not presented in a way that is easy for the client to filter. Perhaps deterministic or canonical, or one other keyword, could change the format to something that would be trivial to filter - maybe even JSON.

from valkey.

madolson commented on May 25, 2024

Adding a filter option to the CLUSTER SHARDS command. This enhancement would enable clients to selectively filter out irrelevant fields for their specific use cases, such as non-deterministic fields, nodes in 'fail' state that irrelevant for the client functionality, or other fields that the client doesn't need. It would enable hash-based comparison of views without the need for parsing the output, while also potentially reducing the size of each node's output.

I'm generally against a lot of complexity like what we would need for filtering. So I would prefer we just add CLUSTER SHARDS DETERMINISTIC (I don't like the word, maybe canonical?) and allow clients to optionally call that and we uphold that type of contract on the server.

Sorting the replica order in the output.

I agree with this.

from valkey.

madolson commented on May 25, 2024

Clearly you feel that the fields are not presented in a way that is easy for the client to filter. Perhaps deterministic or canonical, or one other keyword, could change the format to something that would be trivial to filter - maybe even JSON.

Not sure if the format is the concern. Last time I discussed with Bar ( who works at AWS) was that it was simpler and faster to do it without parsing. Maybe there is more to it though.

from valkey.

PingXie commented on May 25, 2024

The way I see it, the problem with CLUSTER SHARDS is that it mixes both "configs", cluster topology in this case, and "states" information in the output, such as replication offset and health. So maybe another word for "deterministic" is "topology", which reflects better the requirement here. "config" would work for me too.

"deterministic" on the other hand is a bit subjective and leaves room for different interpretation, IMO.

from valkey.

zuiderkwast commented on May 25, 2024

~~@madolson Why is this a major decision?~~ Sorry, I'm reading properly. Not.

Sorting the nodes can't be a breaking change (like @hpatro did for CLUSTER SLOTS in #265).

Regarding filtering arguments, I'm not so sure... For hashing the output without parsing, I think that's one of the reasons CLUSTER NODES returns a string rather than RESP-structured data.

For structured RESP reply like this, a client needs to parse the RESP reply to know where the end of the reply is. You'd need a very special parser to find the length of the response without parsing it. So I think it'd fine that clients can get the reply, then remove some fields (which we can document), then compute a hash of the rest of it.

from valkey.

zuiderkwast commented on May 25, 2024

@PingXie I know that's the suggestion, but adding filter arguments to the command is a larger change than to document what clients need to do if they want to compute this kind of hash.

from valkey.

daniel-house commented on May 25, 2024

The problem seems to be

Currently, some clients rely on a single-node perspective to obtain cluster topology, which may not accurately reflect the entire cluster's state.

The rest of the initial post appears to be offering ways to make it easier for a client to get an accurate picture of the entire cluster's state.

I think this problem may be theoretically unsolvable. If so, let us seek a sufficiently useful approximate picture.

Suppose we had some report from each node that was deterministically comparable, as requested by Bar. (Note: there is nothing subjective about the word deterministic. As Bar wrote: nodes with identical views return identical outputs. However, the choice of deterministic output is subjective as it depends on how we will use it.) Would it be sufficient to find a quorum that is reporting exactly the same state? How big would the quorum have to be? Can we determine if the cluster's topology is in flux due to adding or removing nodes, slot-migration or fail-over, so that we might need to try again later?

What is the ultimate use case? Why do we need an accurate picture of the entire cluster's state? Are we presenting a dashboard to the admins? Are we trying to update the client's slot-map for following moved keys? Are we building a replacement for Sentinel? Do we want to give advice for re-sharding? All of the above?

Would the report be more useful for more use cases if the server did some filtering? Would it be more useful if it were trivial to parse using easily available libraries (e.g., JSON)?

Should it be one or more new CLUSTER subcommands such as STATE, TOPOLOGY, or HEALTH?

from valkey.

daniel-house commented on May 25, 2024

There was some discussion of cluster V2 in #58 .

Could someone link information about cluster V2 into this issue? I don't know anything about it except what I gathered from issue 58, but it seems to me that cluster V2 might provide a better solution for the use case that motivates this discussion of CLUSTER SHARDS.

from valkey.

PingXie commented on May 25, 2024

What is the ultimate use case?

cluster v1 is eventual consistent (if ever). whenever there is a topology change, such as adding replicas, scaling out/re-sharding, or performing rolling upgrades, there needs to be a way for the client to detect that the cluster topology quiesces before the client can safely update its view of the slot->node mapping. The signal that @barshaul is looking for here is that "majority of nodes agree on the cluster topology", as reported by CLUSTER SHARDS.

Why do we need an accurate picture of the entire cluster's state?

Request routing is performed by the client based on the key hash. The client computes the crc16 of the key in question to figure out the (logical) slot it belongs to. However, in order to figure out which node hosts the slot, the client relies on this slot->node mapping, which is obtained by one of the cluster topology query commands such as CLUSTER SHARDS.

Are we presenting a dashboard to the admins?

Not the primary use case

Are we trying to update the client's slot-map for following moved keys?

It is a type of cluster topology change.

Are we building a replacement for Sentinel? Do we want to give advice for re-sharding? All of the above?

Orthogonal to sentinel. This is all about clusters.

Would the report be more useful for more use cases if the server did some filtering?

Server filter vs client filtering boils down to implement-it-once (on the server side) vs implement-it-many-times (for every client interested)

Would it be more useful if it were trivial to parse using easily available libraries (e.g., JSON)?

That would be a breaking change.

from valkey.

daniel-house commented on May 25, 2024

I would never suggest replacing what we have today with JSON because that would indeed be a breaking change. Would allowing the client to request JSON instead of the current format be better than having the client specify a filter?

Please help me continue to pin down the requirements. At this point they appear to be to provide a way for the client to

detect that the cluster topology is not in flux, or
determine that a majority (quorum = greater than 50%) of nodes agree on the cluster topology, or
determine the most consistent representation of the current topology of the cluster, or
all of the above.

In all of these cases I think we can do much better by providing a new CLUSTER subcommand.

For example, suppose the goal is (2) determine that we have a quorum (the percentage might be a variable). We could make it much easier to wait for a quorum by associating a unique, non-decreasing ID with each new topology. Then the client only needs to poll until a quorum reports the same maximal ID. Note: I have an intense dislike for polling, and would look for a better solution.

I can think of lots of possible solutions, but I'd rather wait until we have a solid agreement on the goal.

from valkey.

PingXie commented on May 25, 2024

JSON or not is not at the core of this problem. The output of the existing CLUSTER SHARD is already conceptually JSON. The content is actually what is being discussed in this thread.

associating a unique, non-decreasing ID with each new topology

Can you elaborate how this ID would be maintained? There is the epoch that has this monotonically increasing property but it is a per shard concept and it is not reported by CLUSTER SHARDS

I can think of lots of possible solutions, but I'd rather wait until we have a solid agreement on the goal.

Yes and I would say no.2 and 3 are my understanding of the goal. that said, these are unfortunately band-aid requirements/solutions for cluster v1 IMO. I say we really need the strong consistency for the topology (#58 (comment))

from valkey.

[NEW] Deterministic CLUSTER SHARDS Command about valkey HOT 13 OPEN

Comments (13)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent