PECL memcache extension

Sun Feb 5 16:40:18 UTC 2006

There's now a "memcache.allow_failover" ini directive in CVS which you can 
use to prevent failover and make the client code return false immediately. 
It defaults to true.

Failover may occur at any stage in any of the methods that talk to the 
server (set, get, delete, increment, ...), and as long as there are other 
servers available the client code won't notice (other than an E_NOTICE 
being triggered). Causes that would trigger a failover include socket 
connect failures, read/write errors and Memcached server errors (other than 
out-of-memory).
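
To make the behaviour above concrete, here is a rough model in Python. It is 
not the extension's code: the names (get_with_failover, ServerDown, fetch) 
are made up for illustration, and the order in which other servers are tried 
(hash the key, then walk the list) is an assumption.

```python
import zlib


class ServerDown(Exception):
    """Stands in for a connect failure, read/write error or server error."""


def get_with_failover(servers, key, fetch, allow_failover=True):
    """Return fetch(server, key) from the first server that answers.

    Returns False if no server could be used. `fetch` is a hypothetical
    callable that raises ServerDown when the server cannot be reached.
    """
    if not servers:
        return False
    # Assumed selection scheme: hash the key to pick a starting server.
    start = zlib.crc32(key.encode()) % len(servers)
    for step in range(len(servers)):
        server = servers[(start + step) % len(servers)]
        try:
            return fetch(server, key)
        except ServerDown:
            if not allow_failover:
                # Failover disabled: give up at once and return False.
                return False
            # Failover enabled: move on to the next server in the list.
    return False
```

With failover disabled, only the server the key hashes to is ever contacted; 
with it enabled, the caller sees False only when every server has failed.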

Each persistent connection struct has its own retry timeout, which gets set 
when a failure occurs. After it expires, the connection will be retried and 
possibly marked failed for another retry_interval seconds. Since each Apache 
child might have a connection struct of its own, each child would attempt 
to reconnect every retry_interval seconds when serving a request.
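
As a back-of-the-envelope model of what that means for retry traffic, the 
Python sketch below counts reconnect attempts against a server that stays 
down. ConnStruct and count_reconnects are hypothetical names, and the model 
assumes every child serves one request per second and all children saw the 
failure at time 0.

```python
class ConnStruct:
    """One child's connection record with its own retry deadline."""

    def __init__(self, retry_interval):
        self.retry_interval = retry_interval
        self.retry_at = None  # None means the connection is considered healthy

    def mark_failed(self, now):
        # Do not touch this server again until retry_interval has passed.
        self.retry_at = now + self.retry_interval

    def should_try(self, now):
        return self.retry_at is None or now >= self.retry_at


def count_reconnects(children, retry_interval, duration):
    """Count reconnect attempts over `duration` seconds for one web host."""
    conns = [ConnStruct(retry_interval) for _ in range(children)]
    for conn in conns:
        conn.mark_failed(0)  # every child notices the failure at t=0
    attempts = 0
    for now in range(1, duration + 1):
        for conn in conns:
            if conn.should_try(now):
                attempts += 1
                conn.mark_failed(now)  # server is still down
    return attempts
```

In this model, count_reconnects(10, 60, 180) gives 30: each of the 10 
children retries on its own once per interval, so the retry rate scales with 
the number of children rather than being one retry per host.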

The changes needed to allow a user to specify a callback to be run on 
failback were minor, but since each child on every host might run it when 
it reconnects a failed connection struct, the results were somewhat 
unreliable. There's also the very real possibility that the child creates a 
completely new struct even though persistent connect was specified (for 
example when the connection pool is exhausted) and thus doesn't run the 
callback at all. In any case, I backed out those changes and would recommend 
using a real service monitor (such as "mon") instead, to flush failed 
servers when they come back online.

//Mikael

----- Original Message ----- 
From: "Don MacAskill" <don at smugmug.com>
To: "Mikael Johansson" <mikael at synd.info>
Cc: "memcached mail list" <memcached at lists.danga.com>; "Antony Dovgal" 
<antony at zend.com>
Sent: Saturday, February 04, 2006 8:07 PM
Subject: Re: PECL memcache extension

>
> Sounds like we're on the same page as far as understanding the problem. 
> And I'd definitely like a flag to be able to automatically flush_all() the 
> server which just re-joined the cluster (or even no option, though I might 
> be missing a scenario where you wouldn't want this).
>
> But rather than having to do a flush_all() on every member of the cluster 
> when #2 happens, I'd much rather see something like a php.ini parameter 
> that lets me tell memcache not to rebalance the cluster when one fails:
>
> memcache.rebalance = false
>
> I have enough memcache servers that a failure of one of them doesn't 
> dramatically affect performance.  But having stale data, or having to 
> flush_all() every server would be a Big Deal.
>
> I suppose I could just write a wrapper for memcache in PHP that handles 
> failure scenarios and not use Memcache::addServer() at all if this doesn't 
> sound feasible.
>
> Also, I'd love to get a little insight into exactly what happens when a 
> failure occurs.  What causes memcache to consider a server to be a 
> failure?  Is it only if a socket connect fails?  Or does a failure of some 
> of the commands (delete, for example) also cause a server to be marked as 
> failed?
>
>
> And finally, I see that there's a retry timer.  Is that global for the 
> entire Apache process?  Or just a thread/fork?  If I set it to be 60 
> seconds or something, does that mean there will only be a single retry 
> every 60 seconds for the entire physical server running Apache?  Or are 
> all the threads/forks going to retry every 60 seconds?  I want to make 
> sure we're not retrying so frequently that we're causing it to flap.
>
> A little bit better documentation in this regard would help, but perhaps 
> providing some mechanism where the application can mark a server in the 
> cluster as failed at will would be nice, too.  And is there any way to 
> notify (via php's error log, or firing a function or something) me when a 
> server fails?
>
> Thanks,
>
> Don