Optimising use and using multiple memcache servers in a pool

Sun Jan 21 08:57:49 UTC 2007

Thanks for the insight. As you can probably guess, we implemented this in a
bit of a hurry to deal with a very exceptional peak (and it seems to have
worked).

The more I read the list, the more I understand and the more I wonder :)

> -----Original Message-----
> From: Jason Edgecombe 
> Sent: Sunday, January 21, 2007 12:15 AM
> Subject: Re: Optimising use and using multiple memcache servers in a pool
> 
> You're welcome.
> 
> I've only used memcache in toy applications, not in production. I've
> just been reading the mailing list for a while. ;)

:)

> I honestly don't know about the overhead of one vs multiple servers. It
> depends on the client API implementation.

Yes, the more I read, the more I come to that conclusion.

> According to
> http://usphp.com/manual/en/function.memcache-connect.php
> 
> The connection to the extra server isn't made until it's used.  The
> pconnect method isn't supposed to have to rebuild the connection.
> 
> I would suggest benchmarking with 1, 2, and 3 memcache servers to see
> what the overhead times look like.
> 
> Theoretically, you should only have to start a connection when you do a
> get. Given the server list and weights, compute which servers you need to
> connect to in order to get the keys.

Indeed - though the problem is that, given the type of system this is, I
suspect any difference will be relatively small.
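
To make the "compute which servers to connect to" idea concrete, here is a
rough sketch in Python. It is only an illustration: the host names, the
weights and the use of CRC32 with a simple modulo are my assumptions, and a
real client library may well hash keys differently.

```python
import zlib

def build_buckets(servers):
    """Expand (host, weight) pairs into a flat list; weight = number of slots."""
    buckets = []
    for host, weight in servers:
        buckets.extend([host] * weight)
    return buckets

def server_for(key, buckets):
    """Pick a slot by hashing the key (CRC32 is an illustrative choice)."""
    return buckets[zlib.crc32(key.encode("utf-8")) % len(buckets)]

def servers_needed(keys, buckets):
    """Group keys by owning server, so only those servers need a connection."""
    plan = {}
    for key in keys:
        plan.setdefault(server_for(key, buckets), []).append(key)
    return plan

# Hypothetical pool: cache3 has twice the weight of the other two.
POOL = [("cache1:11211", 1), ("cache2:11211", 1), ("cache3:11211", 2)]
BUCKETS = build_buckets(POOL)
PLAN = servers_needed(["article:1", "article:2", "article:3"], BUCKETS)
```

A request for those three keys would then only open connections to the hosts
that appear in PLAN, which is at most three and possibly just one.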

> A hidden gotcha that comes up on the list occasionally is what happens
> when a server goes down. Do you leave the server in the pool, but just
> not fail over? Or do you fail over, or remove it from the pool and have
> all of your keys rehash? If you fail over or rehash, then you may end up
> with two copies of the data, one of which might be stale.

Yes, it is an issue. I read somewhere that you can set a server to remain in
the pool but be marked as down, but I'm not sure how that affects these
issues.
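
A rough way to see the difference between the two choices, sketched in
Python. The server names and the plain "hash modulo number of servers"
scheme are assumptions for illustration only; a real client may distribute
keys differently.

```python
import zlib

def server_for(key, servers):
    """Naive distribution: CRC32 of the key modulo the number of servers."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]

def moved_keys(keys, before, after):
    """Keys whose owning server changes when the pool goes from before to after."""
    return [k for k in keys if server_for(k, before) != server_for(k, after)]

def lookup_no_failover(key, servers, down):
    """Leave the dead server in the pool: its keys just miss (None), nothing moves."""
    owner = server_for(key, servers)
    return None if owner in down else owner

ALL = ["cache1", "cache2", "cache3"]   # hypothetical pool
SURVIVORS = ["cache1", "cache2"]       # the pool after cache3 is removed
KEYS = ["article:%d" % i for i in range(1000)]

MOVED = moved_keys(KEYS, ALL, SURVIVORS)
print("keys that change server after removal:", len(MOVED), "of", len(KEYS))
```

With a simple modulo scheme and a uniform hash, going from three servers to
two moves about two thirds of all keys, not just the third that lived on the
dead server. Leaving the server in the pool keeps every other key where it
was, at the cost of misses for the keys it owned.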

The other issue that I wonder about is an environment where there are a
relatively small number of items of data but lots of views.  Using a pool
places all the hits for a given item onto a single memcached server rather
than distributing them around.

And again, I don't know if this would be a significant issue in our context
where, although a lot of articles might be looked at in a day, the reality is
that a large proportion of the views will be for a small number of pages.
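
A small Python sketch of that concern. The view counts, key names and server
names are invented for illustration, and the CRC32-modulo distribution is
again only an assumption about how a client might place keys.

```python
import zlib
from collections import Counter

def server_for(key, servers):
    """Naive distribution: CRC32 of the key modulo the number of servers."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]

def load_per_server(views, servers):
    """Sum the hits each server receives, given hits per key."""
    load = Counter({s: 0 for s in servers})
    for key, hits in views.items():
        load[server_for(key, servers)] += hits
    return load

SERVERS = ["cache1", "cache2", "cache3"]   # hypothetical pool
# Made-up daily view counts: one very hot page and a long tail.
VIEWS = {"article:front-page": 90000}
VIEWS.update({"article:%d" % i: 50 for i in range(200)})

LOAD = load_per_server(VIEWS, SERVERS)
HOT_SERVER = server_for("article:front-page", SERVERS)
print(dict(LOAD), "hot page lives on", HOT_SERVER)
```

Whichever server owns the hot page takes all 90000 of its hits, while the
10000 long-tail hits are spread across the pool. Whether that matters depends
on how close a single memcached instance is to its limits.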

I think this next week or so is going to need a little thinking about the
implementation, as there seem to be lots of ways this could be coded, with
different implications.  I'm even thinking that one might be best served
running a number of copies of "memcached" on each server: some as a pool to
provide depth and size for the "long tail", and a smaller cache on each
server for small elements and the current articles that are looked at most in
any one day, where distributing the calls across the servers is advantageous.
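
As a thinking aid, the two-tier idea might look roughly like this in Python.
Everything here is hypothetical: the class name is invented, a plain dict
stands in for the shared pool, and a small LRU structure stands in for the
local "memcached" on each web server.

```python
from collections import OrderedDict

class TwoTierCache:
    """Small local cache for hot items in front of a larger shared pool."""

    def __init__(self, local_capacity, pool):
        self.local = OrderedDict()   # stand-in for the small per-server cache
        self.capacity = local_capacity
        self.pool = pool             # dict stand-in for the shared pool

    def get(self, key):
        if key in self.local:        # hot item: served without touching the pool
            self.local.move_to_end(key)
            return self.local[key]
        value = self.pool.get(key)   # long tail: fall back to the pool
        if value is not None:
            self._remember(key, value)
        return value

    def set(self, key, value):
        self.pool[key] = value
        self._remember(key, value)

    def _remember(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)
        while len(self.local) > self.capacity:
            self.local.popitem(last=False)   # evict the least recently used

cache = TwoTierCache(local_capacity=2, pool={})
for name in ("article:1", "article:2", "article:3"):
    cache.set(name, "body of " + name)
```

The catch is the one raised earlier in the thread: each web server's local
copy can go stale when the pooled copy changes, so the local tier would
probably want a short expiry time.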

All I can say is it is great to have all the options and fantastic that this
seems to work as well as it does.

Thanks for all the input and comments.

Regards
Alan
www.digtialspy.co.uk