Better server selection client code?

Fri Oct 20 09:17:14 UTC 2006

Jeetendra Mirchandani wrote:
> On 10/19/06, Andrew Harbick <aharbick at aharbick.com> wrote:
>> > But here you can have a problem of load distribution. According to the
>> > algorithm, if 2 servers map to points very close on the unit circle, the
>> > load distribution will be screwed badly.
>> >
>> > possible solution - dont use hashing for mapping servers to the unit
>> > circle, instead maintain a map, and pick the longest arc to place the
>> > new server.
>>
>> Agreed.  But I don't think it has to be that fancy.  I just position the
>> servers around the unit circle at even intervals.
>
> But when I add a new server, I dont want to move all the servers to
> maintain the intervals same. That is same as doing a modulo function.

I've gotta stop coding and read this article. But...I was thinking about 
this from a practical standpoint today. The complexity of managing the 
cache mapping could itself become a bottleneck to regular maintenance 
if it were something coded on each client machine (i.e., not a real 
management solution).

However, if I never aim for a perfect cache hit rate to begin with, I 
can move servers in and out of the cache group so long as I already have 
buckets to cover where they were and where they are. If I plan on adding 
some servers, I will add them to bucket positions that I've left 
unfilled and that are currently registering as cache misses. Same thing for 
reboots or maintenance...just leave the bucket empty and assume higher 
cache misses and database load. This is only feasible if you have plenty 
of cache servers and the length of timeout on a cache miss is really 
short. So if you have two memcache servers and one goes down, 
that's likely going to cause undue load. But if you have 10, you're more 
likely to tolerate the higher resulting database load from corresponding 
cache misses. However, this strategy makes it difficult to grow the 
number of caches substantially, because you'd have to assume a wildly 
high cache miss rate to begin with. If you were to double the number of 
caches, you'd have to start with a really bad cache miss rate to fit 
them into the existing bucket mapping.
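
The bucket idea above could be sketched roughly like this in Python. It is
only an illustration under stated assumptions, not code from any memcached
client: NUM_BUCKETS, bucket_for, server_for, assign, unassign and
empty_fraction are hypothetical names, and CRC32 is just an assumed stand-in
for whatever hash the client really uses.

```python
import zlib

# Hypothetical fixed bucket count, chosen up front and never changed.
NUM_BUCKETS = 16

# Bucket index -> server address, or None for a bucket left empty on purpose.
buckets = [None] * NUM_BUCKETS


def bucket_for(key):
    """Map a key to a bucket. Depends only on the key, never on the server list."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def server_for(key):
    """Return the server owning the key's bucket, or None (treat as a cache miss)."""
    return buckets[bucket_for(key)]


def assign(bucket, server):
    """Bring a server into service by filling a previously empty bucket."""
    buckets[bucket] = server


def unassign(server):
    """Take a server out (reboot, maintenance) by emptying its buckets."""
    for i, current in enumerate(buckets):
        if current == server:
            buckets[i] = None


def empty_fraction():
    """Share of buckets with no server: the built-in miss rate if keys hash evenly."""
    return buckets.count(None) / NUM_BUCKETS
```

Filling or emptying one bucket never changes which bucket any key lands in,
so keys in the other buckets keep hitting the same server. The cost is the
one described above: with 10 servers in 20 buckets, about half of all
lookups start out as misses (assuming keys hash evenly), which is what
leaving room to double the number of caches would require.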

Jed