Distributed session store with planned failover ...

Thu Jun 29 15:44:03 UTC 2006

Memcache users:

	We know that memcached is a well done system -- does a great job at  
solving what it sets out to do. One thing I've seen mentioned on this  
list every once in a while is that for big sites when running  
multiple memcached instances, when one goes down, either planned or  
unplanned, there's a corresponding crunch on both the DB and the  
remaining memcached instances as requested keys get re-hashed by  
clients and correspond to failed GETs on the remaining boxes, leading  
to subsequent DB fetches + memcache STOREs. If you were running the  
memcached's because your DB backend just couldn't take it, this  
crunch could really do it to you.

	Unplanned failures happen, oh, say, every other blue moon. But OS  
vendors are _always_ sending out new kernel and libc updates, so if  
you wanted to stay clean inside your datacenter as much as possible,  
you're gonna need to reboot boxes every other month or so.

	I propose an extension to the behavior of memcache client code along  
the lines of raid-5 to mitigate a loading crunch on either the  
remaining memcache boxes or the database box. Imagine there are three  
memcache servers. Instead of the client's hash routing hashing a key  
to a single memcache server out of the three, it hashes to an ordered  
pair, say, (server1, server3). FETCH requests would go out first to  
server 1, then, if that was a miss, then to server3. If server3  
returned the data, then the client [ either immediately or 'after the  
fact in request post-processing' ] is duty-bound to store it on  
server1. STORE requests for this key would be written to both server1  
and server3.

	The other aspect we thought of to top this off would be to have the  
client put an upper-limit on the expiration time of a stored element  
-- say, 23 hours. This is to ensure that, when a server goes down and  
comes back up [ say, server 3 did in the above scenario ], then any  
data hashing to (server1, server3) would eventually result in a MISS  
on server 1, forcing a deep DB read and STORES on both server 1 and  
server 3. In the case of needing to reboot all memcache server  
machines, but doing only one per day, these deep DB reads + pair of  
STOREs would trickle in over a 24 hour period instead of the current  
scenario of all at once.

	So, in recap -- have at least 3 servers, client code hashes to an  
ordered pair of servers, STORES happen to both in the pair, FETCHes  
served successfully by the first server of the pair cause no  
subsequent action, but FETCHES having to be served by the second in  
the pair force a STORE back to the first.

	Multi-key GETs would be a little more complicated. Okay -- much more  
complicated now that I think more about it -- especially when they  
result in incomplete results. But I do believe it'd be possible.

	From an academic standpoint, does this seem like a workable solution  
for providing both failover and high-availability for memcache w/o  
any server-side complication? Have I missed something making this  
untenable?

	We ourselves don't see he DB load to warrant any of this [ we could  
flush memcached at any time and not come anywhere near crunching our  
DB ], but who's to say about the future, you know?

	We, and probably the majority of memcache sites, have either a SPOF  
in the DB machine [ or at least a real hassle ]. We'd like to not  
also have one w/memcached.

----
James Robinson
Socialserve.com