Better server selection client code?

Andrew Harbick aharbick at aharbick.com
Fri Oct 20 15:48:50 UTC 2006


I didn't follow the paper exactly.  I don't hash the positions of the 
servers.  I just hash the key and map it onto the unit circle, then I 
multiply that number by the number of servers, ceil the result, and 
subtract 1.

Here's the code again:
     #  Get the hash value for the key, mask it down to 32 bits, and
     #  map it onto the unit circle as a float in [0.0, 1.0].
     key_unit_circle = (key.hash & 0xffffffff).to_f / 0xffffffff

     #  Figure out which server handles this key.  The max guards the
     #  hash == 0 edge case, where ceil - 1 would otherwise give -1.
     server_num = [(@buckets.size * key_unit_circle).ceil - 1, 0].max

     ...

This, I think, should distribute keys evenly on the unit circle (at 
least as evenly as the underlying hash function does), and the servers 
sit at fixed intervals around the circle.
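
Here's a throwaway check of that claim (the bucket count, the key 
count, and the switch to SHA1 are all arbitrary choices of mine, not 
part of the client):

     #  Hash a pile of generated keys and count how many land in each
     #  bucket.  The counts should come out roughly equal.
     require 'digest/sha1'

     num_buckets = 5
     counts = Array.new(num_buckets, 0)

     100_000.times do |i|
       #  SHA1 instead of String#hash so the numbers are stable across
       #  Ruby processes; take the first 32 bits as the hash value.
       h = Digest::SHA1.hexdigest("key-#{i}")[0, 8].to_i(16)
       key_unit_circle = h.to_f / 0xffffffff
       server_num = [(num_buckets * key_unit_circle).ceil - 1, 0].max
       counts[server_num] += 1
     end

     p counts   #  each entry should land near 20_000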

Alex Stapleton wrote:
> On 20 Oct 2006, at 15:26, Andrew Harbick wrote:
> 
>>> This was exactly my concern - load is not balanced with this
>>> consistent hashing, unless I decide to keep multiple copies of each
>>> entry - which adds a new dimension to the choice of positions for a
>>> new server to be added!
>>
>> My solution does balance load (assuming your key hash is evenly 
>> distributed).  Servers are placed around the unit circle at even 
>> intervals.
> 
> They are placed at intervals based on their hash values, which isn't 
> strictly an even distribution, especially with a small number of 
> entries in the circle ;)
> 
> Of course, as anyone who has read that paper already knows, the 
> easiest way to combat this is to enter each server into the circle at 
> multiple positions. I expect you can just enter some N copies evenly 
> distributed around the circle and you'll quickly start to get a fairly 
> even distribution. Precisely what you choose for N is something to 
> experiment with, though.
> 
> Oh, and how on earth do you handle collisions on the server hashes? We 
> only have 4 billion positions to play with here...
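
To make the multiple-positions idea concrete, here's a rough sketch of 
a ring with N copies per server (the ConsistentRing name, the SHA1 
placement, and copies = 100 are all my own choices, not anything from 
the paper):

     require 'digest/sha1'

     #  Rough sketch of a hash ring where each server appears at
     #  several positions ("copies").  Names here are illustrative.
     class ConsistentRing
       def initialize(servers, copies = 100)
         @points = {}    #  position on the circle => server name
         servers.each do |server|
           copies.times do |i|
             #  Hashing "server:i" spreads each server's copies around
             #  the circle; a 32-bit collision between two points is
             #  rare and would just merge them onto whichever point was
             #  inserted last.
             @points[position_for("#{server}:#{i}")] = server
           end
         end
         @sorted = @points.keys.sort
       end

       #  A key belongs to the first server point at or after its
       #  position, wrapping back to the start if there is none.
       def server_for(key)
         pos = position_for(key)
         point = @sorted.find { |p| p >= pos } || @sorted.first
         @points[point]
       end

       private

       #  Map a string to a 32-bit position.  SHA1 rather than
       #  String#hash so positions are stable across Ruby processes.
       def position_for(str)
         Digest::SHA1.hexdigest(str)[0, 8].to_i(16)
       end
     end

     ring = ConsistentRing.new(%w[cache1 cache2 cache3])
     ring.server_for("user:1234")   #  same server every call until
                                    #  the server list changes

The linear scan in server_for is just for clarity; a real client would 
binary-search the sorted point list.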

