ketama consistent hashing update

Wed Aug 8 18:58:53 UTC 2007

On Wed, 8 Aug 2007, Alex Stapleton wrote:

> Brad Fitzpatrick wrote:

> >    -- what is the input to the digest algorithm for points 1-n?
> >          e.g.:
> >          "127.0.0.1:11211-1"
> >          "127.0.0.1:11211-2"
> >          "127.0.0.1:11211-3"
> >                  ....
> >          "127.0.0.1:11211-<n>"
>
> It seems like running multiple hashes over the string seems like a bit
> of a waste of cycles. Modifying the hash value for each point would be
> much more efficient and simpler to implement. Something like point value
> = <hash value> + (<point number> * 2654435761)

Alex,

I don't think that works well.  Consider:

     item A with weight 1.
     item B with weight 1.

When hashing item A onto the 32-bit continuum you end up with, say, 10.
When hashing item B onto the 32-bit continuum you end up with, say, 11.

Now, because of this one unlucky distribution where A only has 1 unit of
space between it and B, you're not saved by the multiple points.

Now item A's second point is at 2654435761+10, and item B is at
2654435761+11, etc...

So A overall ends up owning n/2**32 of the space, and B ends up owning
(2**32-n)/32 of the space.  Even though they have the same weight!

The whole point of multiple points is to cancel out the unfairness you'd
get if you just did one point.

That said, I'm no expert in this, but your proposal doesn't seem like
it'd work.

- Brad