Why not let the fs handle this ??
Casper Langemeijer
casper at bcx.nl
Thu Jun 8 07:09:58 UTC 2006
I might have something to add here:
We use a dedicated IP range to reach the memcached instances.
Every instance has its own IP. If a machine is down, we let another
machine take its place by failing the IP over to it. This is a common
HA technique.
In this case it is special because we use an IP to identify a service.
You could use hostnames for that, but we don't want to bother with
DNS lookups/TTLs and such.
Furthermore, every memcached instance listens on a different port,
enabling us to temporarily run more than one instance on a machine in
case of total meltdown. It doesn't matter how much memory is assigned
to the temporary instance, as long as it accepts connections...
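To make that concrete, here is a minimal sketch of the
application-side pool (Python; the 10.1.0.x addresses and ports are
made up for illustration). The pool never changes from the client's
point of view; which physical machine answers for a service IP is the
failover layer's problem:

    import hashlib

    # One entry per memcached service: (service IP, port).  The IP
    # identifies the service; distinct ports mean two services can
    # temporarily share one machine during a meltdown.
    POOL = [
        ("10.1.0.1", 11211),
        ("10.1.0.2", 11212),
        ("10.1.0.3", 11213),
    ]

    def server_for(key):
        # Map a cache key to a fixed slot in the service pool.
        digest = hashlib.md5(key.encode()).digest()
        return POOL[int.from_bytes(digest[:4], "big") % len(POOL)]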
Not saying this is _the_ solution, it's just a solution.
Grtz!
On Wed, 7 Jun 2006, Don MacAskill wrote:
> > Has this actually worked out well in practice for anybody? I've found that
> > losing one machine (out of about 100) results in so much db thrashing as the
> > keys get repopulated into different places that the site becomes basically
> > unusable until enough of the cache has been regenerated (5-10 minutes if I'm
> > lucky).
[...cut...]
> The key is to keep that downed 100th machine in your pool, so the key
> allocation algorithm still "counts" it, but to somehow let your application
> know not to write to it while it's in a downed state.
>
> In our particular case, any failed memcached operation causes a server to be
> flagged
> as "down" in our tracker. Then, asynchronously, the state of that server is
> periodically checked. When it comes back up, it's completely flushed, and
> then marked as active. (You have to do the flush to get rid of any stale data
> in case the server was just unresponsive, unreachable, or some other non-hard
> restart situation).
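The reason keeping the downed machine counted in the pool matters:
with naive hash-mod-N allocation, shrinking the pool from N to N-1
servers remaps roughly (N-1)/N of all keys, so nearly the whole cache
misses at once. Below is a rough sketch of the pattern Don describes
(Python, raw memcached text protocol; the single-recv reads and the
one-second timeout are simplifications, and none of this is anyone's
production code). The downed server stays in the pool so key
allocation is unchanged; reads against it are plain misses, writes
are skipped, and a periodic check flushes and reactivates it:

    import hashlib
    import socket

    class Pool:
        def __init__(self, servers):
            # servers: list of (host, port).  Each entry carries an
            # "up" flag that the health check maintains.
            self.servers = [{"addr": s, "up": True} for s in servers]

        def _slot(self, key):
            # Downed servers still count here, so the mapping of
            # keys to servers never shifts when one goes down.
            digest = hashlib.md5(key.encode()).digest()
            return int.from_bytes(digest[:4], "big") % len(self.servers)

        def get(self, key):
            server = self.servers[self._slot(key)]
            if not server["up"]:
                return None           # downed server: treat as a miss
            try:
                reply = self._cmd(server["addr"],
                                  b"get %s\r\n" % key.encode())
                return None if reply.startswith(b"END") else reply
            except OSError:
                server["up"] = False  # flag it; the checker takes over
                return None

        def set(self, key, value):
            server = self.servers[self._slot(key)]
            if not server["up"]:
                return                # skip writes while it is down
            try:
                data = value.encode()
                self._cmd(server["addr"],
                          b"set %s 0 0 %d\r\n%s\r\n"
                          % (key.encode(), len(data), data))
            except OSError:
                server["up"] = False

        def check(self):
            # Run this periodically (cron, a thread, whatever).
            for server in self.servers:
                if server["up"]:
                    continue
                try:
                    # flush_all wipes any stale data left from a soft
                    # failure; only then does the server rejoin.
                    self._cmd(server["addr"], b"flush_all\r\n")
                    server["up"] = True
                except OSError:
                    pass              # still down; retry next round

        def _cmd(self, addr, line):
            with socket.create_connection(addr, timeout=1) as sock:
                sock.sendall(line)
                return sock.recv(4096)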