Handling failovers (rewrite)

Ben Manes ben_manes at yahoo.com
Wed Jul 4 05:28:35 UTC 2007

Thanks Dustin.

Having walked through the code with a colleague, I think I understand how the failover works and why data replication isn't a concern.  Obviously it's not critical because the data is always in the database, but my concern was putting the recovery load on it.  After thinking it through, that's not a big deal.  It only seems like one now because we are overloading our db instances and haven't taken advantage of remote caching.

My first concern was that if a bucket goes down, then once it's removed from rotation your modulus would be affected.  I see that it's not, and instead a gradual degradation algorithm is used.  A key mapping to a failed bucket is re-encoded to point to a live bucket.  While this would return a cache miss, it's trivial for the element to be reloaded on demand into the new bucket.
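To make sure I have the re-encoding right, here is a minimal sketch in Python rather than the Java client's actual code. It assumes the key is re-encoded by prefixing an attempt counter and hashing again. The function name `pick_server`, the CRC32 hash, and the `max_tries` limit are my own illustrative choices, not the client's API.

```python
import zlib

def pick_server(key, servers, live, max_tries=20):
    """Return a live server for key, or None if none is found.

    Assumption: a key that maps to a failed bucket is re-encoded by
    prefixing an attempt counter and hashing again. The modulus stays
    len(servers), so keys on healthy buckets are never remapped.
    """
    n = len(servers)
    index = zlib.crc32(key.encode("utf-8")) % n
    tries = 0
    while servers[index] not in live:
        tries += 1
        if tries > max_tries:
            return None  # give up; the caller falls back to the database
        rehash_key = f"{tries}{key}"  # hypothetical re-encoding scheme
        index = zlib.crc32(rehash_key.encode("utf-8")) % n
    return servers[index]
```

The point of the sketch is that the failed bucket stays in the list, so only the keys that hashed to it are redirected. The first lookup after the failure is a miss on the new bucket, and the element is reloaded from the database on demand.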

The next aspect is the impact on the database.  If you assume no partitioning, then the entire on-demand reloading of that data hits a single database.  However, when the data is partitioned, the remote caching layer is commonly placed outside the partitions, so each database would handle only 1/P of the load, where P is the number of partitions.  As you scale to a greater number of partitions and reduce the database load by caching, the hit becomes a non-issue.
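As a rough check on the 1/P estimate, the sketch below counts how many reloads each partition would serve after one bucket is lost. It assumes keys are assigned to partitions by a uniform hash, which is purely illustrative; `reload_load_per_partition` and the key names are hypothetical.

```python
import zlib
from collections import Counter

def reload_load_per_partition(lost_keys, num_partitions):
    """Count the on-demand reloads each database partition would serve.

    Assumption: a key's partition is chosen by hashing the key modulo
    the number of partitions. Real partitioning schemes may differ.
    """
    load = Counter({p: 0 for p in range(num_partitions)})
    for key in lost_keys:
        load[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return load

# Keys that lived on the failed bucket (made-up names).
lost = [f"user:{i}" for i in range(10000)]
unpartitioned = reload_load_per_partition(lost, 1)  # one database takes it all
partitioned = reload_load_per_partition(lost, 4)    # roughly a quarter each
```

With a reasonably uniform hash, each of the P partitions sees about len(lost) / P reloads, spread out over time since they happen on demand.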

Thus, since this is used for read-only data, a replication policy probably doesn't buy you much.

Let me know if that sounds about right. :-)


----- Original Message ----
From: Dustin Sallings <dustin at spy.net>
To: Ben Manes <ben_manes at yahoo.com>
Cc: memcached at lists.danga.com
Sent: Tuesday, July 3, 2007 9:35:37 PM
Subject: Re: Handling failovers (rewrite)

On Jul 3, 2007, at 21:03 , Ben Manes wrote:

I'm using the standard Java client which, like most clients, supports auto-failover.  I don't quite understand how this works, since my understanding is that the server (bucket) is chosen by hashing the key and modding by the number of servers in the list.  It also doesn't seem like dynamically updating the server list is supported, which other threads have touched on.  Could someone explain the built-in failover support and how robust it is?

	There are two commonly employed strategies:

	1)  Since the server is chosen by computing a hash of the key modulo the number of servers, you must maintain a consistent ordering of servers on each client's list, so you can just walk the list when a server fails.

	2)  Using a yet-to-be-completely-standardized consistent hashing mechanism.
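The first strategy can be sketched as follows. This is a minimal Python illustration, not any particular client's code; `pick_by_walking`, the CRC32 hash, and the server names are assumptions made for the example.

```python
import zlib

def pick_by_walking(key, servers, live):
    """Hash the key modulo the server count, then walk the list.

    Every client must hold the servers in the same order, otherwise
    two clients would walk to different fallback servers.
    """
    n = len(servers)
    start = zlib.crc32(key.encode("utf-8")) % n
    for step in range(n):
        candidate = servers[(start + step) % n]
        if candidate in live:
            return candidate
    return None  # every server is down

servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
primary = pick_by_walking("user:42", servers, set(servers))
fallback = pick_by_walking("user:42", servers, set(servers) - {primary})
```

Here `fallback` is simply the next server in the list after `primary`, wrapping around at the end.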

	Either way, it's up to the client, so as long as you're not trying to read and write data built using different clients, it really doesn't matter which strategy your client employs as long as you've proven it works for you.

	Consistent hashing is conceptually a little more complicated, but causes less stress on your data source when you're growing or shrinking your clusters.
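To illustrate why consistent hashing is gentler when a cluster grows, here is a minimal hash ring sketch in Python. Since the mechanism is not yet standardized, the details are assumptions: the class name `HashRing`, the MD5-based point function, and the 100 points per server are illustrative choices only.

```python
import bisect
import hashlib

def _point(label):
    """Map a string to a 32-bit position on the ring (MD5 is an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:4], "big")

class HashRing:
    """Toy consistent-hash ring; expects a non-empty server list."""

    def __init__(self, servers, replicas=100):
        # Each server owns several points so that load spreads evenly.
        self._points = sorted(
            (_point(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._points]

    def lookup(self, key):
        # The first point clockwise from the key's position owns the key.
        i = bisect.bisect_right(self._hashes, _point(key)) % len(self._hashes)
        return self._points[i][1]

old_servers = ["cache1", "cache2", "cache3", "cache4"]
old_ring = HashRing(old_servers)
new_ring = HashRing(old_servers + ["cache5"])
keys = [f"user:{i}" for i in range(2000)]
moved = [k for k in keys if old_ring.lookup(k) != new_ring.lookup(k)]
```

Every key in `moved` now belongs to `cache5`; no key is shuffled between the existing servers. With a uniform hash you would expect about one fifth of the keys to move, whereas going from 4 to 5 servers under plain modulo hashing remaps about four fifths of them, and each remapped key is a miss that lands on the data source.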

Dustin Sallings

