Better server selection client code?

Fri Oct 20 16:24:18 UTC 2006

Ah, I gave up on reading your code since I don't know any of Ruby's  
magical syntax.

This approach seems like it would make redundancy slightly trickier  
to implement. With the circular method outlined in the paper  
redundant storage can be achieved by storing keys on the N servers  
following the first matching server and then removing servers from  
the circle as they fail.

Of course I have no idea if redundancy is even desirable for  
memcached and your approach is O(1) (ceil() is constant time right?)  
rather than O(log n) which is a nice advantage I suppose. :)

Alex Stapleton
alexs at advfn.com
T: 0207 0700 956

On 20 Oct 2006, at 16:48, Andrew Harbick wrote:

> I didn't follow the paper exactly.  I don't hash the position of  
> the servers.  I just hash the position of key and map the key onto  
> the unit circle then I multiple that number times the number of  
> servers, ceil that and subtract 1.
>
> Here's the code again:
>     #  Get the hash value for the key and map it onto the unit
>     #  circle.
>     key_unit_circle = (key.hash & 0xffffffff).to_f / 0xffffffff
>
>     #  Figure out which server handles this key.
>     server_num = (@buckets.size * key_unit_circle).ceil - 1
>
>     ...
>
> This, I think, should evenly distribute keys on the unit circle (at  
> least as evenly as the hash function) and the servers are at fixed  
> intervals around the unit circle.
>
> Alex Stapleton wrote:
>> On 20 Oct 2006, at 15:26, Andrew Harbick wrote:
>>>> This exactly was my concern - load is not balanced with this
>>>> consistent hashing, unless I decide to keep multiple copies of each
>>>> entry - which ads a new dimension to choice of positions for a new
>>>> server to be added!
>>>
>>> My solution does balance load (assuming your key hash is evenly  
>>> distributed).  Servers are placed around the unit circle at even  
>>> intervals.
>> They are placed at intervals based on their hash value which isn't  
>> strictly an even distribution, especially with a small number of  
>> entries in the circle ;)
>> Of course as anyone who has read that paper already knows the  
>> easiest way to combat this is enter each server into the circle at  
>> multiple positions. I expect you can just enter some N copies  
>> evenly distributed around the circle and will quickly start to get  
>> a fairly even distribution. Precisely what you chose for N is  
>> something to experiment with though.
>> Oh, and how on earth do you handle collisions on the server  
>> hashes? We only have 4billion positions to play with here...