ketama consistent hashing update
Alex Stapleton
alexs at advfn.com
Wed Aug 8 12:03:37 UTC 2007
Brad Fitzpatrick wrote:
> -- which digest algorithm(s) for the points and/or array lookup?
>
There are a seemingly infinite array of hash functions out there but we
already have code for CRC32 on a vast number of platforms, is there any
particular reason – be it speed or quality – that we shouldn't continue
to use that?
> -- how many points are put on the continuum for each target?
> +
> -- what is the input to the digest algorithm for points 1-n?
> e.g.:
> "127.0.0.1:11211-1"
> "127.0.0.1:11211-2"
> "127.0.0.1:11211-3"
> ....
> "127.0.0.1:11211-<n>"
>
>
>
It seems like running multiple hashes over the string seems like a bit
of a waste of cycles. Modifying the hash value for each point would be
much more efficient and simpler to implement. Something like point value
= <hash value> + (<point number> * 2654435761)
As Steve says in his follow-up we also need to handle collisions while
creating the continuum. Simply incrementing the point value by 1 until
we find an empty place is going to give us a rather biased continuum. I
think that incrementing the point number 1 and recalculating the point
value from that should give a better distribution while being fairly
easy to actually implement.
More information about the memcached
mailing list