
Comments (11)

jzawodn commented on August 23, 2024

+1

This is pretty much exactly how we do it at craigslist with our sharding setup. We hash to a "node name" rather than directly to an IP:PORT pair, so it's possible to move data without losing any keys.

http://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/

from twemproxy.

antirez commented on August 23, 2024

Thanks for the ACK Jeremy! I also did the same when trying to implement Dynamo concepts on top of Redis.


manjuraj commented on August 23, 2024

I like this idea of using the "node name" (when specified) instead of the "host:port" pair as input to consistent hashing. I also believe that this should be fairly easy to implement.

Regarding the open problem of priority, we can just use the priority from the "host:port:priority" triplet. For example, for an input like "127.0.0.1:6382:1 server4" we will use "1" as the priority of server4.
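
To make the idea concrete, here is a minimal sketch of how the string fed to the hash could prefer the node name over "host:port". The `struct node` type and the `ring_label` function are hypothetical names made up for illustration; this is not twemproxy's actual code, only one way the proposal might look.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified server record; not twemproxy's actual struct. */
struct node {
    const char *name;   /* optional node name, NULL or "" if not configured */
    const char *host;
    unsigned    port;
    unsigned    weight; /* the "priority" from host:port:priority */
};

/*
 * Write the string that would be hashed for the point_index-th point of
 * this node on the ring. The node name is used when present; otherwise
 * the label falls back to "host:port". Returns the snprintf() result.
 */
int ring_label(const struct node *n, unsigned point_index,
               char *buf, size_t buflen)
{
    if (n->name != NULL && n->name[0] != '\0') {
        return snprintf(buf, buflen, "%s-%u", n->name, point_index);
    }
    return snprintf(buf, buflen, "%s:%u-%u", n->host, n->port, point_index);
}
```

With this shape, two entries that share the name "server4" but differ in address produce identical labels, so moving the instance to another host would leave its ring points where they were.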


antirez commented on August 23, 2024

@manjuraj doesn't the priority affect the way the hash ring is populated (more replicas of the same node if the priority is higher)? If not, I was addressing a nonexistent problem (that just changing the priority would change the map).


charsyam commented on August 23, 2024

@manjuraj I have a question. If server->port is 11211, why don't you include the port number in the hash string? Is there a special reason?

            if (server->port == KETAMA_DEFAULT_PORT) {
                hostlen = snprintf(host, KETAMA_MAX_HOSTLEN, "%.*s-%u",
                                   server->name.len, server->name.data,
                                   pointer_index - 1);
            } else {
                hostlen = snprintf(host, KETAMA_MAX_HOSTLEN, "%.*s:%u-%u",
                                   server->name.len, server->name.data,
                                   server->port, pointer_index - 1);
            }


manjuraj commented on August 23, 2024

@charsyam This code exists for backward compatibility reasons.

When we deployed twemproxy inside Twitter for the memcached protocol, for a while we did dual reads: we read data through the proxy and directly from the backend server cluster, and checked that both code paths returned the same data. Since the client was using libmemcached, we had to use the same consistent hashing algorithm as libmemcached so that keys mapped to the same server.

I guess we can now update this code so that the port number is omitted only if the server pool is a memcache server pool.
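
A rough sketch of what that change might look like, assuming a boolean that says whether the pool speaks memcache. The function name `pool_point_label`, the `is_memcache` parameter, and the `DEFAULT_MEMCACHE_PORT` macro are illustrative names, not identifiers from the twemproxy source.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stddef.h>

#define DEFAULT_MEMCACHE_PORT 11211u /* the port value mentioned in the question above */

/*
 * Build the label hashed for one ring point. The port is left out only
 * for a memcache pool on the default port, which keeps that case in the
 * same "name-index" form as the existing default-port branch; every
 * other case includes the port. Returns the snprintf() result.
 */
int pool_point_label(const char *name, unsigned port, bool is_memcache,
                     unsigned point_index, char *buf, size_t buflen)
{
    if (is_memcache && port == DEFAULT_MEMCACHE_PORT) {
        return snprintf(buf, buflen, "%s-%u", name, point_index);
    }
    return snprintf(buf, buflen, "%s:%u-%u", name, port, point_index);
}
```

Under this sketch a Redis pool that happens to listen on 11211 would still get its port in the label, while memcache pools keep the existing behavior.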


manjuraj commented on August 23, 2024

@antirez priority refers to the weight of a server. For example, if I am running redis on server1 with 4G and another redis on server2 with 8G, I would want to give server2 twice the weight given to server1 so that the keys distribute evenly across the total cluster memory.

So, if a server migrates from "127.0.0.1:6379:X server1" to "1.2.3.4:8888:Y server1", we ensure that the weights X and Y stay the same to keep the key mapping stable.
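
As a simplified illustration of why the weight matters, the sketch below assigns ring points in direct proportion to the weight. The constant `POINTS_PER_WEIGHT_UNIT` and both function names are assumptions made for this example; the real ketama point calculation is more involved, but the effect of changing a weight is the same in spirit: the set of points on the ring changes, and with it the key mapping.

```c
#include <stddef.h>

#define POINTS_PER_WEIGHT_UNIT 160u /* assumed value, for illustration only */

/* Number of ring points a server gets under a simple proportional scheme. */
unsigned ring_points(unsigned weight)
{
    return weight * POINTS_PER_WEIGHT_UNIT;
}

/* Integer percentage of all ring points owned by server i. */
unsigned ring_share_percent(const unsigned *weights, size_t n, size_t i)
{
    unsigned long total = 0;
    for (size_t k = 0; k < n; k++) {
        total += ring_points(weights[k]);
    }
    if (total == 0) {
        return 0;
    }
    return (unsigned)(100ul * ring_points(weights[i]) / total);
}
```

With weights 1 and 2 (the 4G and 8G boxes), the second server owns about twice as many points as the first. Changing a weight during a migration would add or remove points, which is why X and Y need to match.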


antirez commented on August 23, 2024

@manjuraj yes, you and I understand this, but IMHO this is what a random user will do:

"Hey we got this new fast box with BIG RAM! Holy Shit let's move one of our instances there"

- 192.168.1.3:6379:10 server1
+ 192.168.1.5:6379:99 server1

"Look, I updated the priority because this box is so much bigger!"

And the user ends up with data shuffled across instances in a way that is very hard to recover from.

So back to my proposals: honestly, both ignoring priority and putting it into the name sound wrong to me, for the following reasons:

  • Ignoring priority is a surprising behavior.
  • Forcing it to be part of the name could work, but there is still a numerical part, as in "myserver:1000", and users may still think that the numerical part can be changed without problems.

It's probably better just to use warnings inside the documentation to make sure people understand that changing the priority OR the instance name will result in a different mapping of keys.


charsyam commented on August 23, 2024

@antirez @manjuraj it is a complicated problem. I also think ignoring priority is a good approach when redis is true, but it can cause some confusion because twemproxy also has to support memcache.

Like craigslist, some users could also use a setup like the one below.

192.168.1.3:2000:1 server1-1
192.168.1.3:2001:1 server1-2
192.168.1.3:2002:1 server1-3
192.168.1.3:2003:1 server1-4

But no one can deny that users will easily make a mistake.


antirez commented on August 23, 2024

Maybe the ultimate solution is this:

  • If node ejection is false.
  • If redis is true
  • If for every node the user specified a node name

THEN -> Exit with an error if the specified priority is not always "1", with an error message that makes sense, like:
"You are proxying Redis protocol with node ejection disabled and explicit names for all the nodes. In this setup usually a static map between keys and hosts is needed, so all the instances must be configured with priority 1 (otherwise changing the priority may change how keys are mapped to servers)."

Optionally, there could be an option to still allow non-1 priorities with Redis servers in this setup.

Ok I think so far this is absolutely the best option we have.
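
The check described above could be sketched roughly as follows. The struct layouts and the function name `pool_weights_ok` are hypothetical and only model the three conditions from the list; they are not taken from twemproxy's configuration code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-server settings; field names are illustrative. */
struct pool_server {
    const char *name;   /* NULL or "" when no node name was given */
    unsigned    weight; /* the "priority" in host:port:priority */
};

/* Hypothetical pool settings modelling the three conditions. */
struct pool_conf {
    bool redis;                        /* proxying the Redis protocol */
    bool auto_eject_hosts;             /* node ejection enabled */
    const struct pool_server *servers;
    size_t nservers;
};

/*
 * Returns false only when the pool is Redis, node ejection is off,
 * every server has a name, and some server has a priority other than 1.
 * In every other configuration the rule does not apply.
 */
bool pool_weights_ok(const struct pool_conf *p)
{
    if (!p->redis || p->auto_eject_hosts) {
        return true;
    }
    for (size_t i = 0; i < p->nservers; i++) {
        const char *name = p->servers[i].name;
        if (name == NULL || name[0] == '\0') {
            return true; /* not every node is named, so no static map is implied */
        }
    }
    for (size_t i = 0; i < p->nservers; i++) {
        if (p->servers[i].weight != 1) {
            return false;
        }
    }
    return true;
}
```

A caller would print the error message quoted above and exit when this returns false, unless an override option is set.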


manjuraj commented on August 23, 2024

fixed by @charsyam; docs updated: https://github.com/twitter/twemproxy/blob/master/notes/recommendation.md#node-names-for-consistent-hashing

