Optimising use and using multiple memcache servers in a pool

Alan Jay alan_jay_uk at yahoo.co.uk
Sat Jan 20 08:40:37 UTC 2007



Thanks Marcus,

> > In my simple implementation I have 7 web servers, each running its
> > own copy of memcached.  On each server the local PHP code selects
> > that local memcached server and checks whether the article it is
> > loading has been cached, and loads it from the cache if it has.
> > When someone edits an article, the Article Management System resets
> > the expire time from 6 hours to 5 seconds on each of the servers in
> > turn.
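
As a rough illustration of the expiry reset described above, here is a
minimal Python sketch.  FakeCache is a hypothetical in-memory stand-in for
one memcached instance, and expire_soon, LONG_TTL and SHORT_TTL are names
invented for the example; the real code is PHP talking to real servers.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for a single memcached instance."""

    def __init__(self):
        self._items = {}  # key -> (value, absolute expiry time in seconds)

    def set(self, key, value, ttl, now):
        self._items[key] = (value, now + ttl)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None or entry[1] <= now:
            return None  # missing or already expired
        return entry[0]


LONG_TTL = 6 * 60 * 60  # 6 hours, the normal lifetime of a cached article
SHORT_TTL = 5           # 5 seconds, used once the article has been edited


def expire_soon(caches, key, now):
    """Re-store the cached value with a short TTL on every cache holding it."""
    for cache in caches:
        value = cache.get(key, now)
        if value is not None:
            cache.set(key, value, SHORT_TTL, now)


# Seven per-web-server caches, all holding the same article.
caches = [FakeCache() for _ in range(7)]
for c in caches:
    c.set("article:42", "old body", LONG_TTL, now=0)

# The article is edited at t=100: visit each server in turn.
expire_soon(caches, "article:42", now=100)
```

After the call, each cache still serves the old copy for a few seconds and
then misses, so the next request on each server reloads from the database.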

To explain further: in this particular application only ONE machine (which is
in fact separate from the front-end servers) can update the records.

> > This seems to be working OK this morning and we will get more stats
> > over time but the pecl-memcached interface has the ability to run
> > multiple servers by adding and opening them all for selection.  But
> > if (as I am) my script opens a single file it seems like quite a
> > lot of potential overhead and I'm not sure what the advantages to
> > doing it like that are.

> Hm. I think you've also missed what memcached does to some extent
> (see below). I'm not sure where the breakpoint will lie. I'd be
> tempted to try running just a few instances (say 3-4) that are used
> by all servers. That way you lower the overhead of having so many
> connections to make, though of course it's also dependent on how much
> stuff you are putting in the cache. Remember that a single cache/
> retrieve operation will only ever result in a single request to one
> memcached server, no matter how many servers you have.

OK, my worry was that each time a user requests an article you have to open
up ALL the memcached servers, which sounds like quite a lot of overhead for a
single transaction.  Though you knowledgeable people may tell me this is not
actually an issue.
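
To make Marcus's point concrete (one cache/retrieve operation goes to one
server, however many are in the pool), here is a small Python sketch.  It
assumes a simple hash-modulo scheme and a pool that connects only on first
use; pick_server and LazyPool are hypothetical names, and a real client may
hash keys differently.  The PECL memcache documentation for addServer
describes connections as being opened only when needed, which is the
behaviour the sketch imitates.

```python
import zlib


def pick_server(key, servers):
    """Map a key to exactly one server using a simple hash-modulo scheme."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


class LazyPool:
    """Hypothetical pool that only 'connects' to a server when it is used."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.opened = set()  # servers we have actually connected to

    def server_for(self, key):
        server = pick_server(key, self.servers)
        self.opened.add(server)  # connect on first use, not when added
        return server


pool = LazyPool(["10.0.0.1:11211", "10.0.0.2:11211",
                 "10.0.0.3:11211", "10.0.0.4:11211"])
chosen = pool.server_for("article:42")
```

Under these assumptions, fetching one article touches one server, and the
other three entries in the pool cost nothing for that request.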

I think I understand the trade-off you are explaining, and it makes sense.

> > At present the code already fails safe in that if it is unable to
> > connect to the local memcached server it just gets the data from
> > the MySQL server directly.
> >
> > What advantages are there to using the multiple memcached servers
> > added to the connection pool?
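
The fail-safe behaviour mentioned in the quoted paragraph above can be
sketched in Python as follows.  All names here (CacheUnavailable,
load_article, broken_cache, fake_db) are hypothetical and only illustrate
the idea; the real code is PHP and its error handling may differ.

```python
class CacheUnavailable(Exception):
    """Raised by the (hypothetical) cache client when it cannot connect."""


def load_article(article_id, cache_get, db_fetch):
    """Try the cache first; on a miss or a cache failure, go to the database."""
    try:
        cached = cache_get(article_id)
    except CacheUnavailable:
        cached = None  # treat an unreachable cache like a miss
    if cached is not None:
        return cached
    return db_fetch(article_id)


def broken_cache(article_id):
    # Stand-in for a memcached server that is down.
    raise CacheUnavailable("cannot connect")


def fake_db(article_id):
    # Stand-in for the MySQL query.
    return "article %d from database" % article_id


result = load_article(42, broken_cache, fake_db)
```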

> One server can benefit from getting something from the cache that was
> put there by another server. 

OK so when you add to the pool it adds to ALL the servers.

> By not sharing your servers, you're
> missing out on this big feature. Also, if you're only doing local
> caching, you're far better off doing it in APC as it will be much
> faster. 

We are already using APC, but I thought that memcached was doing something
different.  In my context it is the difference between caching the PHP and
caching the data from the MySQL database (which is what we are trying to
protect).

This particular application is slightly unusual in that our goal is to reduce
the activity on our MySQL database.  For example, if an article is read 10,000
times over a couple of hours, it is a great improvement for the article to be
read from the database once for each cache (8 times) rather than 10,000 times
(or have I missed something?).
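
The arithmetic can be checked with a small Python simulation.  It assumes
requests are spread round-robin over the web servers and that nothing
expires during the period; count_db_reads is a hypothetical helper written
for this illustration.

```python
def count_db_reads(requests, web_servers, shared):
    """Count database reads for one article under the two cache layouts."""
    db_reads = 0
    # One cache for everybody (shared pool) or one per web server (local).
    caches = [dict()] if shared else [dict() for _ in range(web_servers)]
    for i in range(requests):
        cache = caches[0] if shared else caches[i % web_servers]
        if "article" not in cache:
            db_reads += 1            # cache miss: load from the database
            cache["article"] = "body"
    return db_reads


local_reads = count_db_reads(10000, 8, shared=False)
pooled_reads = count_db_reads(10000, 8, shared=True)
```

With local caches the database is read once per web server; with a shared
pool it is read once in total.  Either way it is a large reduction from
10,000 reads.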

However, I do see what you are saying, and I think I am beginning to
understand what the implications are.  It all comes back to the amount of
overhead imposed by opening up the pool instead of a single server.
 
> Also, by not sharing your caches, you're just asking for
> coherency problems - where different servers end up with different
> versions of the same item, e.g.
> 
> server A fetches object 1, stores in local cache
> server B fetches object 1, stores in local cache
> server A saves a change to object 1, updates its local cache
> 
> Server B now has outdated version of object 1.
> 
> By having all servers use the same list of memcached servers, you
> avoid this problem.
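
Marcus's three-step example can be replayed in a few lines of Python.  The
dictionaries stand in for the database and the caches, and the scenario
function is a hypothetical helper for this illustration only.

```python
def scenario(shared):
    """Replay the three steps and return what server B reads afterwards."""
    database = {"object1": "v1"}
    pool = {}
    # With a shared pool both servers see the same cache; otherwise each
    # server has its own private one.
    cache_a = pool if shared else {}
    cache_b = pool if shared else {}

    def read(cache, key):
        if key not in cache:
            cache[key] = database[key]  # miss: load from the database
        return cache[key]

    read(cache_a, "object1")    # server A fetches object 1, caches it
    read(cache_b, "object1")    # server B fetches object 1, caches it
    database["object1"] = "v2"  # server A saves a change ...
    cache_a["object1"] = "v2"   # ... and updates the cache it uses
    return read(cache_b, "object1")


stale = scenario(shared=False)
fresh = scenario(shared=True)
```

With separate caches server B keeps serving the old version until its copy
expires; with a shared cache it sees the update straight away.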

OK - in this context this is not an issue but I can see how it could be an
issue and the advantages of doing it properly.

In this context this application seems (I think) to work fine, but we do have
another application (a forum) that can use memcached if we upgrade to the
latest version.  So I suspect we will have to alter our setup in future to be
in line with the way the forum does things.

As you can tell, I am just dipping into this complex issue, primarily because
our site has been under very heavy load the last few days (up from 2.5 million
pages a day to 4 million pages a day), and we have been trying to reduce the
amount of access to the database by this application so that our other
applications don't cause the database server to crash.

Re-reading the description on the DANGA website, it suggests placing a cache
on every HTML server.  But again, I remain curious about the overhead imposed
by loading the entire pool each time a query is required.

Marcus, from your comments you seem to suggest that a good compromise might be
to run 4 memcached servers and use them as a pool, as opposed to 8 servers in
the pool or 8 servers run individually.

I'd be interested in any other comments or thoughts from some of you who have
been using this tool for some time.

Alan

PS After a very difficult day on Thursday, Friday went off without any major
issues after we implemented this code early Friday morning.  It was our second
busiest day, with 4 million page views for the day and a peak hour at 9pm with
324,000 page views.
 
> Marcus
> --
> Marcus Bointon



