PECL memcache extension
Don MacAskill
don at smugmug.com
Sat Feb 4 19:07:24 UTC 2006
Sounds like we're on the same page as far as understanding the problem.
And I'd definitely like a flag to automatically flush_all() the server
which has just re-joined the cluster (or even have that happen
unconditionally, though I might be missing a scenario where you
wouldn't want it).
But instead of having to do a flush_all() on every member of the
cluster when your #2 happens, I'd much rather see something like a
php.ini parameter that lets me tell memcache not to rebalance the
cluster when a server fails:
memcache.rebalance = false
I have enough memcache servers that a failure of one of them doesn't
dramatically affect performance. But having stale data, or having to
flush_all() every server would be a Big Deal.
I suppose if this doesn't sound feasible, I could just write a PHP
wrapper for memcache that handles failure scenarios itself and doesn't
use Memcache::addServer() at all.
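It'd look something like this (a rough, untested sketch; the class,
its names, and the bucket math are all just illustrative):

<?php
// Hypothetical wrapper: one connection per bucket. Keys always hash to
// the same bucket, dead or alive, so nothing ever gets rehashed.
class BucketedCache {
    public $conns = array(); // bucket index => Memcache object, or null if offline

    public function addBucket($host, $port = 11211) {
        $mc = new Memcache();
        // On connect failure, keep the bucket but mark it offline.
        $this->conns[] = @$mc->connect($host, $port) ? $mc : null;
    }

    private function bucketFor($key) {
        return abs(crc32($key)) % count($this->conns);
    }

    public function get($key) {
        $mc = $this->conns[$this->bucketFor($key)];
        // An offline bucket is just a miss; the caller falls back to
        // the original source (filesystem, MySQL, whatever).
        return $mc ? $mc->get($key) : false;
    }

    public function set($key, $val, $ttl = 0) {
        $mc = $this->conns[$this->bucketFor($key)];
        return $mc ? $mc->set($key, $val, 0, $ttl) : false;
    }
}
?>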
Also, I'd love to get a little insight into exactly what happens when a
failure occurs. What causes memcache to consider a server failed? Is it
only a failed socket connect? Or does a failure of some of the commands
(delete, for example) also cause a server to be marked as failed?
And finally, I see that there's a retry timer. Is it global for the
entire Apache instance, or per thread/fork? If I set it to 60 seconds
or so, does that mean there will be only a single retry every 60
seconds for the entire physical server running Apache? Or will all the
threads/forks each retry every 60 seconds? I want to make sure we're
not retrying so frequently that we cause the dead server to flap.
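To put rough numbers on it: say a box runs 200 Apache children (a
made-up figure), and each child tracks its own 60-second retry timer.
That's a connection attempt to the dead server about every 300ms from
that one box alone, which is exactly the kind of flapping I'd like to
avoid.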
A little better documentation in this regard would help, but providing
some mechanism where the application can mark a server in the cluster
as failed at will would be nice, too. And is there any way to be
notified (via PHP's error log, or by firing a callback or something)
when a server fails?
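Something along these lines would do it, I think (an entirely
hypothetical API, just to sketch the shape of what I mean; none of
these methods exist today):

function my_failure_logger($host, $port) {
    // Fires whenever the extension marks a server as failed.
    error_log("memcache: server $host:$port marked as failed");
}

$memcache = new Memcache();
$memcache->addServer('10.0.0.1');
$memcache->addServer('10.0.0.2');
// Hypothetical: get told about failures...
$memcache->setFailureCallback('my_failure_logger');
// ...and mark a server dead at will from the application.
$memcache->setServerStatus('10.0.0.2', 11211, false);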
Thanks,
Don
Mikael Johansson wrote:
> We've had thoughts about this before, specifically allowing the client
> to provide a callback that gets run on successful reconnect of a failed
> backend.
>
> 1) At the very least the reconnected backend should be flushed to ensure
> it doesn't have any stale data lying around from before the failure.
> 2) If the host should fail once more, load would again be distributed
> around the cluster, but these other servers might now have stale data
> from having handled the previous failure.
>
> If failures are infrequent (say, spaced more than the maximum TTL of 1
> day apart), only 1) would be necessary. 2) would require flushing the
> entire cluster on failback. Since the callback would be run in all the
> processes on all the frontends, one would also have to take precautions
> not to have them all execute the flush. For example, by add()ing a
> "last_failback_" + (string)floor(time()/60) key on the failed host: if
> the add() fails, some other frontend has already performed the failback
> procedures.
>
> Suggested API method:
> Memcache::setReconnectedCallback(fpCallback) : bool
> function myCallback(Memcache pool, string reconnected_host, string
> reconnected_port) : void
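>
> Roughly, the callback body might look like this (untested, and the
> guard key's TTL is arbitrary):
>
> function myCallback($pool, $reconnected_host, $reconnected_port) {
>     // add() only succeeds if the key doesn't exist yet, so exactly
>     // one frontend wins the race and performs the flush.
>     $guard = "last_failback_" . (string)floor(time() / 60);
>     if ($pool->add($guard, 1, 0, 120)) {
>         $pool->flush();
>     }
> }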
>
> Any thoughts?
>
> //Mikael
>
> ----- Original Message ----- From: "Don MacAskill" <don at smugmug.com>
> To: "memcached mail list" <memcached at lists.danga.com>
> Cc: "Antony Dovgal" <antony at zend.com>; <mikl at php.net>
> Sent: Friday, February 03, 2006 11:37 PM
> Subject: PECL memcache extension
>
>
>>
>> So I was excited to see the PECL memcache extension had gotten
>> multiple server support. But after installing it and playing with it,
>> it doesn't seem to work the way I'd hoped (and the way I currently use
>> memcache).
>>
>> Maybe I'm just being dumb, and if so, I'd love to hear how I can get
>> smart. Until my dumb-ness has been established, though, let me
>> describe how it looks like the PECL extension is working:
>>
>> - using 'addServer', you can add multiple servers and assign each a
>> different bucket weight (Example: 3 servers, each with 1 bucket = 3
>> total buckets). Good.
>>
>> - When a server fails, that server's buckets are removed from the pool
>> and all future get/set/etc commands are reallocated to the remaining
>> pool. (Example: ServerB fails, so the "set key1" that was going to
>> ServerB instead now goes to ServerA)
>>
>> And now, how my memcached setup works:
>>
>> - When a server fails, I mark those buckets as "offline" and no longer
>> permit get/set to that portion of the memcache cluster. All of that
>> data must be served from its original source (filesystem, MySQL,
>> whatever) until that server comes back up, at which point it's flushed
>> and marked as "active" again. Yes, I know it's slower during a failure,
>> but since we have lots of memcache servers, only a fraction of the
>> cluster is slower. And failures are rare to boot.
>>
>> - Note that the "bucket weight" calculation still takes the "offline"
>> buckets into account, so keys stay mapped to the dead servers rather
>> than being re-allocated.
>>
>> And now why:
>>
>> If "set key1" is destined to ServerB's buckets, but ServerB fails, I
>> don't want "key1" being redirected to ServerA instead. Why? Because
>> when ServerB comes online, I now have "key1" in two places, and one of
>> them will now potentially get out of date. Should ServerB fail again
>> before "key1" has expired, calls to "get key1" will return old stale
>> data from ServerA instead of fresh data or no result.
>>
>> Make sense? Am I doing something wrong? Can the PECL extension work
>> in this fashion?
>>
>> Don
>>
>>
>>
>>
>>