Better server selection client code?
Andrew Harbick
aharbick at aharbick.com
Fri Oct 20 15:48:50 UTC 2006
I didn't follow the paper exactly. I don't hash the positions of the
servers. I just hash the key and map it onto the unit circle, then I
multiply that number by the number of servers, take the ceiling of the
result, and subtract 1.
Here's the code again:
# Get the 32-bit hash value for the key and map it onto the
# unit circle as a float in [0.0, 1.0].
key_unit_circle = (key.hash & 0xffffffff).to_f / 0xffffffff
# Figure out which server handles this key.  A hash of exactly
# 0 yields -1 here, which Ruby's negative indexing wraps around
# to the last bucket (0 and 1 are the same point on the circle).
server_num = (@buckets.size * key_unit_circle).ceil - 1
...
This, I think, should distribute keys evenly around the unit circle (at
least as evenly as the hash function does), with the servers sitting at
fixed, even intervals around the circle.
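Putting the pieces together, a minimal self-contained sketch of that
lookup might look like the following (SimpleHashRing and server_for are
illustrative names, not the actual client code; note also that on newer
Ruby versions String#hash is randomized per process, so a real client
would want a stable hash such as Zlib.crc32):

class SimpleHashRing
  def initialize(servers)
    @buckets = servers  # e.g. ["cache1:11211", "cache2:11211"]
  end

  # Map the key onto the unit circle and pick the server whose
  # fixed interval contains that point.
  def server_for(key)
    key_unit_circle = (key.hash & 0xffffffff).to_f / 0xffffffff
    server_num = (@buckets.size * key_unit_circle).ceil - 1
    @buckets[server_num]  # -1 wraps to the last bucket when the hash is 0
  end
end

ring = SimpleHashRing.new(["cache1:11211", "cache2:11211", "cache3:11211"])
puts ring.server_for("user:42")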
Alex Stapleton wrote:
> On 20 Oct 2006, at 15:26, Andrew Harbick wrote:
>
>>> This was exactly my concern - load is not balanced with this
>>> consistent hashing, unless I decide to keep multiple copies of each
>>> entry - which adds a new dimension to the choice of positions for a
>>> new server to be added!
>>
>> My solution does balance load (assuming your key hash is evenly
>> distributed). Servers are placed around the unit circle at even
>> intervals.
>
> They are placed at intervals based on their hash value, which isn't
> strictly an even distribution, especially with a small number of
> entries in the circle ;)
>
> Of course, as anyone who has read that paper already knows, the
> easiest way to combat this is to enter each server into the circle at
> multiple positions. I expect you can just enter some N copies evenly
> distributed around the circle and will quickly start to get a fairly
> even distribution. Precisely what you choose for N is something to
> experiment with, though.
>
> Oh, and how on earth do you handle collisions on the server hashes? We
> only have 4 billion positions to play with here...
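For reference, here is a rough sketch of the multiple-positions idea
Alex describes above: hash each server onto the circle N times, then
send a key to the first server point clockwise from the key's own
position. The names (ConsistentHashRing, POINTS_PER_SERVER) are
illustrative, and a real client would use a binary search over the
sorted points rather than a linear scan:

require 'zlib'

class ConsistentHashRing
  POINTS_PER_SERVER = 100  # the N to experiment with

  def initialize(servers)
    @ring = {}
    servers.each { |s| add(s) }
  end

  def add(server)
    POINTS_PER_SERVER.times do |i|
      # Hashing "server:i" gives each copy a distinct position and makes
      # exact collisions between servers unlikely; a collision simply
      # overwrites one of a server's N points, which costs very little.
      @ring[Zlib.crc32("#{server}:#{i}")] = server
    end
    @sorted_points = @ring.keys.sort
  end

  def server_for(key)
    point = Zlib.crc32(key)
    # First ring point clockwise from the key, wrapping past 2**32.
    candidate = @sorted_points.find { |p| p >= point }
    @ring[candidate || @sorted_points.first]
  end
end

ring = ConsistentHashRing.new(["cache1:11211", "cache2:11211", "cache3:11211"])
puts ring.server_for("user:42")

With this arrangement, adding or removing a server only remaps the keys
that fall into that server's N arcs, which is exactly the property the
paper is after.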