Why not let the fs handle this ??
sgrimm at facebook.com
Wed Jun 7 16:59:24 UTC 2006
Jon Drukman wrote:
> Has this actually worked out well in practice for anybody? I've found
> that losing one machine (out of about 100) results in so much db
> thrashing as the keys get repopulated into different places that the
> site becomes basically unusable until enough of the cache has been
> regenerated (5-10 minutes if i'm lucky).
The problem there is not with lots of memcached servers, but that you're
letting your client relocate your keys when a server goes down. Our
approach is just to let the missing machine cause cache misses while the
others continue to serve up their usual data. (A missing memcached never
stays missing for long since our operations team monitors the servers
and fixes any broken ones pretty quickly.)
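To put rough numbers on the difference, here is a small Python sketch. It assumes a client that picks a server with a CRC32 hash modulo the server count; the server names, key names, and the function names are made up for illustration, and real clients may hash differently. It counts how many keys change servers if the client rebuilds its list from the live servers, versus how many simply miss if the mapping is left alone.

```python
import zlib

def server_for(key, servers):
    """Pick a server by hashing the key modulo the server count (assumed scheme)."""
    return servers[zlib.crc32(key.encode()) % len(servers)]

def fraction_affected(keys, all_servers, dead):
    """Return (relocated, missed) fractions after `dead` goes down.

    relocated: keys that land on a different server if the client
               rehashes over the remaining live servers.
    missed:    keys that simply miss if the client keeps the original
               mapping and treats the dead server as a cache miss.
    """
    live = [s for s in all_servers if s != dead]
    relocated = 0
    missed = 0
    for k in keys:
        before = server_for(k, all_servers)
        if before == dead:
            missed += 1
        if server_for(k, live) != before:
            relocated += 1
    return relocated / len(keys), missed / len(keys)

# Hypothetical pool of 100 servers and 20000 keys.
servers = ["cache%03d" % i for i in range(100)]
keys = ["user:%d" % i for i in range(20000)]
rehash, fixed = fraction_affected(keys, servers, "cache042")
print("relocated when rehashing over live servers: %.3f" % rehash)
print("missed when the mapping is left alone:      %.3f" % fixed)
```

With a hash that spreads keys evenly, the first figure should come out close to 1 and the second close to 1/100, which is the gap between the thrashing described above and a modest rise in cache misses.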
The 1.1.13 version of memcached looks like it has some new features to
deal with server loss without relocating all the keys while still giving
all the keys a place to live, so that'll definitely be an improvement.
But you can get a reasonable failure mode right now by just controlling
what the client does when a server is down. Some of the memcached
clients have an option you can set to control that behavior; others
you'll have to hack.
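As a sketch of what "controlling what the client does" can look like, here is a toy Python client. Everything in it is hypothetical: FakeServer is an in-memory stand-in for a memcached connection, and the rehash_on_failure flag is an invented name, not the option of any particular client library. With the flag off, a key whose home server is down is treated as a miss and is never moved elsewhere.

```python
import zlib

class FakeServer:
    """In-memory stand-in for one memcached connection (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

class Client:
    """Toy client; rehash_on_failure is an invented option name."""
    def __init__(self, servers, rehash_on_failure=False):
        self.servers = servers
        self.rehash_on_failure = rehash_on_failure

    def _pick(self, key):
        h = zlib.crc32(key.encode())
        home = self.servers[h % len(self.servers)]
        if home.up:
            return home
        if not self.rehash_on_failure:
            # Leave the key where it belongs; the caller sees a miss.
            return None
        # Failover: choose among the servers that are still up.
        live = [s for s in self.servers if s.up]
        return live[h % len(live)] if live else None

    def get(self, key):
        server = self._pick(key)
        return server.data.get(key) if server else None

    def set(self, key, value):
        server = self._pick(key)
        if server is None:
            return False  # home server is down; skip the write
        server.data[key] = value
        return True

# Usage: the key misses while its home is down, then is served again.
pool = [FakeServer("cache%d" % i) for i in range(4)]
client = Client(pool)
client.set("user:7", "profile data")
home = client._pick("user:7")
home.up = False
print(client.get("user:7"))   # miss while the server is down
home.up = True
print(client.get("user:7"))   # original data is still there
```

The point of the design is that the other servers' keys never move, so when the broken machine comes back only its own slice of the cache has to be refilled.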