Optimising use and using multiple memcache servers in a pool

Sun Jan 21 17:18:46 UTC 2007

Alan Jay wrote:
> Thanks for the insight. As you can probably tell, we implemented this in a bit
> of a hurry to deal with a very exceptional peak (and it seems to have worked).
>
> The more I read the list the more I understand and the more I wonder :)
>
>   
> [...]
>> a hidden gotcha that comes up on the list occasionally is what happens
>> when a server goes down. Do you leave the server in the pool, but just
>> not failover or do you failover or remove it from the pool and have all
>> of your keys rehash. If you failover or rehash, then you may end up with
>> two copies of the data which might be stale.
>>     
>
> Yes, it is an issue. I read somewhere that you can set a server to remain in
> the pool but be marked down, but I'm not sure how it affects these issues.
>
>   
I think that leaving a down server in the pool, without failing over, is 
the best practice for short-term outages. If you flush the cache of the 
down server before it comes back up, then you won't have stale data.
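
As a rough sketch of that approach, here is a toy in-process model of a 
pool. The Pool class, its method names, and the crc32-modulo key mapping 
are all assumptions made for illustration; real memcache clients have 
their own hashing and their own way of marking a server dead.

```python
import zlib

class Pool:
    """Toy stand-in for a memcache pool (hypothetical, not a real client)."""

    def __init__(self, names):
        self.names = list(names)               # fixed order; never shrinks
        self.stores = {n: {} for n in names}   # one dict per "server"
        self.down = set()

    def server_for(self, key):
        # The mapping depends only on the key and the pool size,
        # so marking a server down does not move any key.
        return self.names[zlib.crc32(key.encode()) % len(self.names)]

    def get(self, key):
        name = self.server_for(key)
        if name in self.down:
            return None                        # plain cache miss, no failover
        return self.stores[name].get(key)

    def set(self, key, value):
        name = self.server_for(key)
        if name in self.down:
            return False                       # skip the write, no failover
        self.stores[name][key] = value
        return True

    def mark_down(self, name):
        self.down.add(name)

    def mark_up(self, name):
        # Flush before rejoining so that values stored before the outage
        # cannot be served stale.
        self.stores[name].clear()
        self.down.discard(name)
```

While a server is marked down, its keys simply miss and the application 
falls back to the database. No second copy is written elsewhere, so 
nothing can go stale.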

> The other issue that I wonder about is in an environment where there are a
> relatively small number of items of data but lots of views.  Using a pool
> places all the hits onto a single memcached server rather than distributing
> them around.
>
> And again I don't know if this would be a significant issue in our context
> where although a lot of articles might be looked at in a day the reality is
> that a large proportion of the views will be for a small number of pages.
>
>   
I suspect that even a small handful of high-traffic objects will still 
be distributed somewhat evenly over the memcache servers, assuming the 
number of objects is greater than the number of memcache servers. 
Besides, high-traffic objects are what memcache was designed for. :)
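
One way to check this for your own keys is to count how the requests 
would land. The sketch below assumes a simple crc32-modulo mapping and 
made-up key names and view counts; your client's hash function will give 
different numbers, so treat it only as a way to eyeball the spread.

```python
import zlib
from collections import Counter

def hits_per_server(hits_by_key, n_servers):
    """Sum request counts per server index, using crc32(key) % n_servers."""
    load = Counter()
    for key, hits in hits_by_key.items():
        load[zlib.crc32(key.encode()) % n_servers] += hits
    return load

# Hypothetical traffic: 20 hot articles with 1000 views each.
hot = {f"article:{i}": 1000 for i in range(20)}
load = hits_per_server(hot, 4)
for server in range(4):
    print(server, load[server])
```

With only one or two hot keys the load cannot be spread, since each key 
lives on exactly one server. With more hot keys than servers it usually 
evens out reasonably well.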

> I think this next week or so is going to need a little thinking about the
> implementation as there seem to be lots of ways this could be coded with
> different implications.  I'm even thinking that one might be best served
> running a number of copies of "memcached" on each server some as a pool to
> provide depth and size for the "long tail" and a smaller cache on each server
> for small elements and current articles that are looked at most in any one day
> and where distributing the calls across the servers is advantageous.
>
>   
I think that may be more complex than you need. If you have different 
types of data like user data vs articles, then running multiple pools 
might make sense.

In the case of similar data like articles, just use one pool. Set an 
expiration time on each object and do some testing to size the pool 
appropriately. Let memcache manage the heavily used objects. By design, 
less-used objects will fall out of the cache, while heavily used objects 
will stay in the cache longer.
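
To illustrate the behaviour, here is a very rough model of a cache with 
expiration times and least-recently-used eviction. The TinyLRU class is 
hypothetical, and it ignores how memcached actually lays out memory; it 
only shows why recently used objects survive and idle ones drop out.

```python
import time
from collections import OrderedDict

class TinyLRU:
    """Toy cache: per-object expiry plus least-recently-used eviction."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock
        self.items = OrderedDict()   # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self.items[key] = (value, self.clock() + ttl)
        self.items.move_to_end(key)              # newest entry is most recent
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)       # evict least recently used

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]                  # expired: treat as a miss
            return None
        self.items.move_to_end(key)              # a read keeps the object "hot"
        return value
```

Every read moves an object to the back of the eviction queue, so an 
article that keeps getting viewed is not the one thrown out when space 
runs short.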

If you have single caches on each server for lesser used objects, that's 
duplicated storage that could be used for more frequently used objects.

If you want lesser-used objects to be cached less, then you could give 
them a shorter expiration time than heavily used objects, but even 
without that, frequently used objects should push them out of the cache 
or they will expire.

Start with one big pool for all objects. Do some testing to see if you 
have enough cache to save your frequently used objects and adjust the 
size accordingly.
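
Before testing on live servers, a quick simulation can give a feel for 
how hit rate changes with cache size. Everything in this sketch is an 
assumption: the 1000 articles, the 1/rank popularity curve, and the plain 
LRU model. Replaying a real access log through it would be more useful 
than the synthetic trace.

```python
import random
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay a sequence of keys through an LRU cache; return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)           # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)    # evict least recently used
    return hits / len(trace)

# Hypothetical workload: 1000 articles, popularity falling off as 1/rank.
rng = random.Random(0)
articles = list(range(1000))
weights = [1 / (rank + 1) for rank in articles]
trace = rng.choices(articles, weights=weights, k=20000)

for capacity in (10, 100, 500, 1000):
    print(capacity, round(lru_hit_rate(trace, capacity), 3))
```

Once the hit rate stops improving much as capacity grows, adding more 
memory to the pool is unlikely to buy you much.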

> All I can say is it is great to have all the options and fantastic that this
> seems to work as well as it does.
>
> Thanks for all the input and comments.
>
> Regards
> Alan
> www.digtialspy.co.uk 
>   

It's nice to have options, but it can also be bewildering.
Glad to help.

Jason