Better server selection client code?

Andrew Harbick aharbick at aharbick.com
Fri Oct 20 16:41:51 UTC 2006


> Ah, I gave up on reading your code since I don't know any of Ruby's 
> magical syntax.

You poor thing ;)

> 
> This approach seems like it would make redundancy slightly trickier to 
> implement. With the circular method outlined in the paper, redundant 
> storage can be achieved by storing keys on the N servers following the 
> first matching server and then removing servers from the circle as they 
> fail.

I wasn't going for redundancy.  The default client doesn't either.  My 
goal was simply to get the client to choose a server for a given key 
such that adding a new server doesn't cause the client to choose a 
different server for EVERY key.

Basically, I can add memcache servers to my fleet and not cause the 
cache to effectively be flushed.
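
To see that concretely, here's a quick standalone sketch (not code 
from either client; MD5 stands in for Ruby's key.hash so the counts 
are repeatable, and the helper names are made up) that counts how many 
of 10,000 keys change servers when you grow from 4 to 5 servers, under 
plain modulo hashing versus the unit-circle scheme:

    require 'digest/md5'

    #  32-bit hash via MD5, standing in for key.hash so the numbers
    #  below are stable across runs.
    def h32(key)
      Digest::MD5.hexdigest(key)[0, 8].to_i(16)
    end

    #  The scheme from this thread: map the hash onto the unit circle,
    #  scale by the server count, ceil, and subtract 1 (clamped at 0).
    def circle_bucket(key, n)
      x = h32(key).to_f / 0x100000000
      [(n * x).ceil - 1, 0].max
    end

    #  The naive scheme for comparison: hash modulo the server count.
    def modulo_bucket(key, n)
      h32(key) % n
    end

    keys = (1..10_000).map { |i| "key#{i}" }
    circle = keys.count { |k| circle_bucket(k, 4) != circle_bucket(k, 5) }
    modulo = keys.count { |k| modulo_bucket(k, 4) != modulo_bucket(k, 5) }
    puts "circle: #{circle}/#{keys.size} keys moved going 4 -> 5 servers"
    puts "modulo: #{modulo}/#{keys.size} keys moved going 4 -> 5 servers"

With modulo hashing roughly four keys in five move; with the 
unit-circle scheme it's roughly half, because only the keys whose 
fixed intervals shift get remapped.  Not as good as the paper's full 
scheme, but far from a full flush.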

> 
> Of course, I have no idea if redundancy is even desirable for memcached, 
> and your approach is O(1) (ceil() is constant time, right?) rather than 
> O(log n), which is a nice advantage, I suppose. :)
> 
> Alex Stapleton
> alexs at advfn.com
> T: 0207 0700 956
> 
> 
> On 20 Oct 2006, at 16:48, Andrew Harbick wrote:
> 
>> I didn't follow the paper exactly.  I don't hash the positions of the 
>> servers.  I just hash the key and map it onto the unit circle, then I 
>> multiply that number by the number of servers, ceil it, and subtract 1.
>>
>> Here's the code again:
>>     #  Get the hash value for the key and map it onto the unit
>>     #  circle.
>>     key_unit_circle = (key.hash & 0xffffffff).to_f / 0xffffffff
>>
>>     #  Figure out which server handles this key.  The max guard
>>     #  keeps a hash of exactly 0 from yielding index -1.
>>     server_num = [(@buckets.size * key_unit_circle).ceil - 1, 0].max
>>
>>     ...
>>
>> This, I think, should evenly distribute keys on the unit circle (at 
>> least as evenly as the hash function) and the servers are at fixed 
>> intervals around the unit circle.
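
A quick way to sanity-check that distribution claim: hash a pile of 
keys onto the unit circle (MD5 standing in for key.hash, purely 
illustrative) and count how many land in each of 8 equal buckets:

    require 'digest/md5'

    #  Hash 100,000 keys onto the unit circle and tally which of 8
    #  fixed, equal-width buckets each one lands in.
    counts = Hash.new(0)
    100_000.times do |i|
      x = Digest::MD5.hexdigest("key#{i}")[0, 8].to_i(16).to_f / 0x100000000
      counts[[(8 * x).ceil - 1, 0].max] += 1
    end
    counts.keys.sort.each { |b| puts "bucket #{b}: #{counts[b]} keys" }

Each bucket should come out near 12,500 keys, i.e. about as even as 
the hash function itself.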
>>
>> Alex Stapleton wrote:
>>> On 20 Oct 2006, at 15:26, Andrew Harbick wrote:
>>>>> This was exactly my concern - load is not balanced with this 
>>>>> consistent hashing, unless I decide to keep multiple copies of each 
>>>>> entry - which adds a new dimension to the choice of positions when 
>>>>> adding a new server!
>>>>
>>>> My solution does balance load (assuming your key hash is evenly 
>>>> distributed).  Servers are placed around the unit circle at even 
>>>> intervals.
>>> They are placed at intervals based on their hash values, which don't 
>>> give a strictly even distribution, especially with a small number of 
>>> entries in the circle ;)
>>> Of course, as anyone who has read that paper already knows, the 
>>> easiest way to combat this is to enter each server into the circle at 
>>> multiple positions. I expect you can just enter some N copies evenly 
>>> distributed around the circle and you will quickly start to get a 
>>> fairly even distribution. Precisely what you choose for N is something 
>>> to experiment with, though.
>>> Oh, and how on earth do you handle collisions on the server hashes? 
>>> We only have 4 billion positions to play with here...
> 
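
For anyone following along, the "multiple positions" trick Alex 
describes is what's usually called virtual nodes.  Here's a minimal 
sketch of the idea (illustrative only, not either client's code; the 
replica count and names are arbitrary, and using the full 128-bit MD5 
digest for positions sidesteps the 4-billion-position collision 
worry):

    require 'digest/md5'

    REPLICAS = 100  #  points per server; a number to experiment with

    #  Position on the circle: the full 128-bit digest, so collisions
    #  are vanishingly unlikely.
    def point(s)
      Digest::MD5.hexdigest(s).to_i(16)
    end

    #  Insert each server at REPLICAS points, sorted by position.
    def build_ring(servers)
      ring = []
      servers.each do |s|
        REPLICAS.times { |i| ring << [point("#{s}:#{i}"), s] }
      end
      ring.sort
    end

    #  A key belongs to the first point at or after its own hash,
    #  wrapping around to the start of the circle.
    def lookup(ring, key)
      h = point(key)
      pair = ring.find { |p, _| p >= h } || ring.first
      pair.last
    end

    ring = build_ring(%w(10.0.0.1:11211 10.0.0.2:11211 10.0.0.3:11211))
    puts lookup(ring, "some_key")

A real client would binary-search the sorted ring (that's the O(log n) 
Alex mentions); the linear find here is just for brevity.  A dead 
server has its REPLICAS points removed from the ring, and only the 
keys that mapped to those points move.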

