ketama consistent hashing update

Wed Aug 8 12:03:37 UTC 2007

Brad Fitzpatrick wrote:
>    -- which digest algorithm(s) for the points and/or array lookup?
>   
There are a seemingly infinite array of hash functions out there but we 
already have code for CRC32 on a vast number of platforms, is there any 
particular reason – be it speed or quality – that we shouldn't continue 
to use that?
>    -- how many points are put on the continuum for each target?
>       +
>    -- what is the input to the digest algorithm for points 1-n?
>          e.g.:
>          "127.0.0.1:11211-1"
>          "127.0.0.1:11211-2"
>          "127.0.0.1:11211-3"
>                  ....
>          "127.0.0.1:11211-<n>"
>
>
>   
It seems like running multiple hashes over the string seems like a bit 
of a waste of cycles. Modifying the hash value for each point would be 
much more efficient and simpler to implement. Something like point value 
= <hash value> + (<point number> * 2654435761)

As Steve says in his follow-up we also need to handle collisions while 
creating the continuum. Simply incrementing the point value by 1 until 
we find an empty place is going to give us a rather biased continuum. I 
think that incrementing the point number 1 and recalculating the point 
value from that should give a better distribution while being fairly 
easy to actually implement.