ketama consistent hashing update
Steven Grimm
sgrimm at facebook.com
Wed Aug 8 12:25:46 UTC 2007
Alex Stapleton wrote:
> As Steve says in his follow-up we also need to handle collisions while
> creating the continuum. Simply incrementing the point value by 1 until
> we find an empty place is going to give us a rather biased continuum.
> I think that incrementing the point number 1 and recalculating the
> point value from that should give a better distribution while being
> fairly easy to actually implement.
I'm not a big fan of that idea because an implementer would have to be
very careful to avoid insertion order bugs. You need the mapping from a
key to a host to be the same whether the host list was initialized as
(A,B,C,D) or as (A,B,D) with (C) added later, and any kind of "Oops,
that bucket is taken already, do something else with the node I'm trying
to insert" algorithm will, unless it's carefully implemented,
potentially give you a subtly different data structure in those two cases.
I'm much more in favor of making "n" large enough that it's reasonable
to say something like, "Oops, that bucket is taken already; overwrite it
with me if the address of the node it currently points to is numerically
less than mine, else don't insert myself here." Then you will
deterministically end up with exactly the same search tree no matter
what order the nodes are inserted. If you are inserting each node 10
times in a tree and your search space is a full 32 bits, the occasional
node that's inserted 9 times instead is going to be basically irrelevant
to the health of the system. Not perfectly balanced, sure, but there
will almost never be any collisions to begin with and it isn't worth
optimizing such a rare edge case.
-Steve
More information about the memcached
mailing list