ketama consistent hashing update

Fri Aug 10 12:46:53 UTC 2007

Hi,

The NON_BLOCKING_IO branch of pecl/memcache has (among other things) the option
to use consistent hashing, the algorithm uses CRC32 but is otherwise compatible
with Brad's Perl implementation. Currently it uses 100 (will increase to 160 and
make configurable in the future) points per server and 1000 cached servers like
its Perl counterpart.

Some sort of standardization on a hash function and how the servers are mapped
would be much appreciated, I only used CRC32 because it's already available and
generates a nice 32-bit hash, but switching to FNV or something other should be
trivial given that an open source implementation compatible with the PHP license
exists.

//Mikael

On Thu Aug  9 12:20 , Richard Jones  sent:

>Multiple hash functions are required or you lose the benefit of multiple 
>points per server in the first place, for the reasons brad and russ 
>mentioned. 
>
>I'm going to give crc32 a go (for lookups and initial hashing) and see how the 
>distribution looks, it's probably cheaper, and makes sense as it is more 
>widely available. There was no particular reason why I used md5 over any 
>other hash.
>
>160 was chosen as the number of points per server because it showed a decent 
>distribution.. it's "big enough". Once the hashing and collision handling are 
>well defined, this could potentially be tuned per-install, and still maintain 
>compatability between clients. However, i think we can agree on a sane 
>default with a bit more testing.
>
>I recently spotted a consistent hashing implementation in a branch of php/pecl 
>memcache cvs - http://pecl.php.net/package/memcache which iirc uses crc32 
>instead of md5. Need to investigate this some more.
>
>RJ
>
>
>On Wednesday 08 August 2007 13:03:37 Alex Stapleton wrote:
>> Brad Fitzpatrick wrote:
>> >    -- which digest algorithm(s) for the points and/or array lookup?
>>
>> There are a seemingly infinite array of hash functions out there but we
>> already have code for CRC32 on a vast number of platforms, is there any
>> particular reason – be it speed or quality – that we shouldn't continue
>> to use that?
>>
>> >    -- how many points are put on the continuum for each target?
>> >       +
>> >    -- what is the input to the digest algorithm for points 1-n?
>> >          e.g.:
>> >          "127.0.0.1:11211-1"
>> >          "127.0.0.1:11211-2"
>> >          "127.0.0.1:11211-3"
>> >                  ....
>> >          "127.0.0.1:11211-"
>>
>> It seems like running multiple hashes over the string seems like a bit
>> of a waste of cycles. Modifying the hash value for each point would be
>> much more efficient and simpler to implement. Something like point value
>> =  + ( * 2654435761)
>>
>> As Steve says in his follow-up we also need to handle collisions while
>> creating the continuum. Simply incrementing the point value by 1 until
>> we find an empty place is going to give us a rather biased continuum. I
>> think that incrementing the point number 1 and recalculating the point
>> value from that should give a better distribution while being fairly
>> easy to actually implement.
>
>
>
>-- 
>Richard Jones
>Last.fm Ltd. | http://www.last.fm/
>Office: +44 (0) 207 780 7080
>)