Better server selection client code?
Andrew Harbick
aharbick at aharbick.com
Fri Oct 20 16:41:51 UTC 2006
> Ah, I gave up on reading your code since I don't know any of Ruby's
> magical syntax.
You poor thing ;)
>
> This approach seems like it would make redundancy slightly trickier to
> implement. With the circular method outlined in the paper redundant
> storage can be achieved by storing keys on the N servers following the
> first matching server and then removing servers from the circle as they
> fail.
I wasn't going for redundancy; the default client doesn't provide it
either. My goal was simply to get the client to choose a server for a
given key such that adding a new server doesn't cause the client to
choose a different server for EVERY key. Basically, I can add memcached
servers to my fleet without effectively flushing the cache.
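To make that concrete, here's a minimal standalone sketch of the scheme
(the helper names and the MD5-based hash are stand-ins of my own, not
the actual client code). It picks a server the same way - map the key
onto the unit circle, multiply by the server count, ceil, subtract 1 -
and then counts how many keys change servers when the fleet grows from
10 to 11, compared against plain modulo hashing:

require 'digest/md5'

# Map a key onto the unit circle as a float in [0.0, 1.0].
def unit_circle(key)
  Digest::MD5.hexdigest(key)[0, 8].to_i(16).to_f / 0xffffffff
end

# Pick a server index: ceil(N * u) - 1, with a modulo so the
# u == 0.0 edge wraps around the circle to the last server
# instead of indexing -1.
def pick_server(key, num_servers)
  ((num_servers * unit_circle(key)).ceil - 1) % num_servers
end

keys = (1..10_000).map { |i| "key#{i}" }

# How many keys land on a different server when we go 10 -> 11?
moved_circle = keys.count { |k| pick_server(k, 10) != pick_server(k, 11) }
moved_modulo = keys.count do |k|
  h = Digest::MD5.hexdigest(k)[0, 8].to_i(16)
  (h % 10) != (h % 11)
end

puts "unit circle: #{moved_circle} of #{keys.size} keys moved"
puts "modulo:      #{moved_modulo} of #{keys.size} keys moved"

In a run like this, roughly half the keys move under the unit-circle
scheme versus about 90% under modulo - far fewer, though still more
than a true consistent-hash ring, which would move only about 1/(N+1)
of them.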
>
> Of course I have no idea if redundancy is even desirable for memcached,
> and your approach is O(1) (ceil() is constant time, right?) rather than
> O(log n), which is a nice advantage, I suppose. :)
>
> Alex Stapleton
> alexs at advfn.com
> T: 0207 0700 956
>
>
> On 20 Oct 2006, at 16:48, Andrew Harbick wrote:
>
>> I didn't follow the paper exactly. I don't hash the positions of the
>> servers; I just hash the key and map it onto the unit circle, then I
>> multiply that number by the number of servers, ceil it, and subtract 1.
>>
>> Here's the code again:
>> # Get the hash value for the key and map it onto the unit
>> # circle (a float in [0.0, 1.0]).
>> key_unit_circle = (key.hash & 0xffffffff).to_f / 0xffffffff
>>
>> # Figure out which server handles this key: ceil maps the
>> # interval (i/N, (i+1)/N] to bucket i. (A hash of exactly 0
>> # yields -1, which Ruby conveniently wraps to the last bucket.)
>> server_num = (@buckets.size * key_unit_circle).ceil - 1
>>
>> ...
>>
>> This, I think, should distribute keys evenly around the unit circle
>> (at least as evenly as the hash function does), and the servers sit
>> at fixed, even intervals around the circle.
>>
>> Alex Stapleton wrote:
>>> On 20 Oct 2006, at 15:26, Andrew Harbick wrote:
>>>>> This was exactly my concern - load is not balanced with this
>>>>> consistent hashing unless I decide to keep multiple copies of each
>>>>> entry - which adds a new dimension to the choice of positions for
>>>>> a newly added server!
>>>>
>>>> My solution does balance load (assuming your key hash is evenly
>>>> distributed). Servers are placed around the unit circle at even
>>>> intervals.
>>> They are placed at intervals based on their hash values, which isn't
>>> strictly an even distribution, especially with a small number of
>>> entries in the circle ;)
>>> Of course, as anyone who has read that paper already knows, the
>>> easiest way to combat this is to enter each server into the circle
>>> at multiple positions. I expect you can just enter some N copies
>>> evenly distributed around the circle and will quickly start to get a
>>> fairly even distribution. Precisely what you choose for N is
>>> something to experiment with, though.
>>> Oh, and how on earth do you handle collisions on the server hashes?
>>> We only have 4 billion positions to play with here...
>
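For comparison, here's a small sketch of the ring approach Alex
describes (again my own stand-in code, not from the paper or any
existing client). Each server is hashed onto the circle at several
positions, lookups binary-search the sorted positions (hence the
O(log n)), and hashing "server:replica" strings keeps the positions
distinct, so a collision among the ~4 billion points just means one
replica overwrites another, which is harmless:

require 'digest/md5'

# A minimal Karger-style ring: each server appears at several
# hashed positions; a key belongs to the first server position
# at or clockwise-after the key's own hash.
class HashRing
  def initialize(servers, replicas = 100)
    @replicas = replicas
    @ring = {}                 # position on circle => server name
    @sorted = []
    servers.each { |s| add(s) }
  end

  def add(server)
    @replicas.times do |i|
      # Hashing "server:i" gives each replica a distinct position.
      @ring[position("#{server}:#{i}")] = server
    end
    @sorted = @ring.keys.sort  # kept sorted for binary search
  end

  def remove(server)
    @ring.delete_if { |_, s| s == server }
    @sorted = @ring.keys.sort
  end

  # O(log n) lookup: binary-search for the first position at or
  # after the key's hash, wrapping around past the top.
  def server_for(key)
    h = position(key)
    point = @sorted.bsearch { |p| p >= h } || @sorted.first
    @ring[point]
  end

  private

  # A 32-bit position on the circle (~4 billion distinct points).
  def position(s)
    Digest::MD5.hexdigest(s)[0, 8].to_i(16)
  end
end

ring = HashRing.new(%w[cache1 cache2 cache3])
puts ring.server_for("user:42")
ring.add("cache4")             # only keys near cache4's points move
puts ring.server_for("user:42")

Redundancy then falls out the way the earlier message suggests: store
each key on the N server positions that follow its first match.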