ketama - a consistent hashing algo for memcache clients

Ian Holsman lists at
Wed Apr 11 04:45:58 UTC 2007

The old entry would only exist for as long as your expiry time, so steps 3-5
would all have to occur within your expiry window for this to take effect.


On 11/04/2007, at 2:18 PM, Timo Ewalds wrote:

> I'd like to point out that the problem of stale cache entries  
> exists if you add AND remove servers without flushing all the data.  
> Here's the sequence that would cause it:
> 1. Start a bunch of servers
> 2. Run for a while
> 3. Add 1 server
> 4. Run for a while
> 5. Remove the server added in step 3.
> Now a key that mapped to one of the original servers, but moved to
> the new server, maps to the old server again. If the key hasn't
> expired on the old server, it'll be returned as if it were valid,
> though it may have been changed/invalidated on the new server. To
> avoid this, you can never remove servers after adding servers.  
> Adding is fine, removing is fine, adding fresh/flushed ones after  
> removing is fine, but removing after adding is not.
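
[The sequence above can be sketched with a toy ring. This is a hypothetical sketch using one point per server and Java's TreeMap, not the ketama code: removing the added server restores the ring exactly as it was, so the key's original owner, possibly still holding an unexpired stale value, serves it again.]

```java
import java.util.*;

// Toy consistent-hash ring: one point per server (hypothetical sketch,
// not the libketama API), enough to show the add-then-remove hazard.
public class StaleEntryDemo {
    static String pick(TreeMap<Integer, String> ring, String key) {
        int h = key.hashCode() & 0x7fffffff;            // non-negative position
        Map.Entry<Integer, String> e = ring.ceilingEntry(h);
        return (e != null) ? e.getValue() : ring.firstEntry().getValue();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        for (String s : new String[]{"cache-a", "cache-b", "cache-c"})
            ring.put(s.hashCode() & 0x7fffffff, s);     // steps 1-2

        String owner = pick(ring, "user:42");           // original owner caches the value

        int added = "cache-d".hashCode() & 0x7fffffff;
        ring.put(added, "cache-d");                     // step 3: the key may migrate here
        ring.remove(added);                             // step 5: take it out again

        // The ring is now exactly what it was before step 3, so the key
        // maps straight back to the original owner, and any unexpired
        // entry there is served as if it were fresh.
        System.out.println(pick(ring, "user:42").equals(owner));  // prints "true"
    }
}
```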
> Note that this problem is very similar to the one of rehashing key  
> locations when a server goes down and comes back up without being  
> flushed (i.e. intermittent network issues). I believe the consensus  
> to fix it is just to not rehash and deal with the extra db load  
> from having a downed memcache server.
> As a separate issue, presumably if you're adding servers often,  
> you've got many servers. If you're adding 100 buckets per server  
> (to make the distribution even), what's the CPU usage like to  
> figure out which server to choose, compared to the simple hash  
> method? I'd guess that in PHP it could be quite a bit heavier, since  
> the bucket list needs to be rebuilt on each page.
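
[On the CPU question: the per-key cost with virtual buckets is one hash plus a binary search over the sorted bucket points, so it grows only logarithmically with the bucket count; the expensive part is building the ring, which a per-request PHP client would repeat each page. A sketch under assumed names, with 64-bit FNV-1a standing in for whatever hash the real client uses:]

```java
import java.util.*;

// Sketch of virtual-bucket lookup (hypothetical API, not libketama):
// building the ring is the O(n log n) part; each key lookup afterwards
// is one hash plus a binary search (TreeMap.ceilingEntry).
public class VirtualBucketRing {
    final TreeMap<Long, String> points = new TreeMap<>();

    VirtualBucketRing(List<String> servers, int pointsPerServer) {
        // This rebuild is what a PHP client would pay on every page
        // unless the ring is cached across requests.
        for (String s : servers)
            for (int i = 0; i < pointsPerServer; i++)
                points.put(fnv1a(s + "#" + i), s);
    }

    static long fnv1a(String s) {
        long h = 0xcbf29ce484222325L;                // 64-bit FNV-1a
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }

    String pick(String key) {
        Map.Entry<Long, String> e = points.ceilingEntry(fnv1a(key));
        return (e != null) ? e.getValue() : points.firstEntry().getValue();
    }
}
```

[With 100 servers at 100 points each, a lookup searches 10,000 sorted points, roughly 14 comparisons; the simple hash-mod method is cheaper per key, but it is what causes the remapping problem in the first place.]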
> Has anyone tried using consistent hashing not to make server  
> management easier, but to increase locality of keys to make  
> get_multi's more likely to use fewer servers? Something like have  
> the top 20 bits be a hash of the prefix, and the lower 12 bits be a  
> hash of the suffix (id?). With consistent hashing, keys with the  
> same prefix will then be nearby on the number line, so they are  
> likely to map to the same server.
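
[The bit-split described above could look like the following. This is a hypothetical layout, with Java's `hashCode` standing in for whatever hash the client would use: the top 20 bits depend only on the prefix, so all keys sharing a prefix land within one 4096-wide window of the ring.]

```java
// Hypothetical composite hash for key locality: the top 20 bits come
// from the key's prefix, the low 12 bits from its suffix, so keys with
// the same prefix fall within one 4096-wide window on the number line.
public class CompositeHash {
    static int compose(String prefix, String suffix) {
        int top = (prefix.hashCode() & 0xFFFFF) << 12;  // 20 prefix bits
        int low = suffix.hashCode() & 0xFFF;            // 12 suffix bits
        return top | low;
    }

    public static void main(String[] args) {
        int h1 = compose("user", "17");
        int h2 = compose("user", "42");
        // Same prefix: the hashes share their top 20 bits, so they sit
        // at most 4095 apart on the number line.
        System.out.println((h1 >>> 12) == (h2 >>> 12));  // prints "true"
    }
}
```

[The trade-off is that only 4096 distinct positions exist per prefix, so a very hot prefix concentrates its whole key range on one server.]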
> Another solution which we already use for a specific case would be  
> to have different server pools for different data groups (I believe  
> Steven Grimm mentioned doing this), but it is annoying to do in the  
> general case.
> Timo
> Richard Jones wrote:
>> Every time we added memcached servers to the pool, the rehashing  
>> of all keys would kill our database while the cache filled up  
>> again. We implemented a consistent hashing algorithm that our  
>> memcached clients are now using to map keys to servers. Now we can  
>> add new memcached servers without causing chaos :)
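
[The rehashing stampede described above is easy to quantify with naive modulo placement. This is a hypothetical simulation, not their client code: growing the pool from 10 to 11 servers changes `hash % n` for roughly 10 of every 11 keys, so almost the whole cache misses at once, while a consistent hash moves only the keys on the new server's arc.]

```java
// Hypothetical simulation of naive modulo placement: count how many
// keys change servers when the pool grows from 10 to 11 servers. A
// consistent hash would instead move only about 1/11 of the keys.
public class ModuloRemap {
    static double remappedFraction(int keys, int from, int to) {
        int moved = 0;
        for (int i = 0; i < keys; i++) {
            int h = ("key:" + i).hashCode() & 0x7fffffff;
            if (h % from != h % to) moved++;
        }
        return (double) moved / keys;
    }

    public static void main(String[] args) {
        // Roughly 10/11 (~0.91) of keys land on a different server.
        System.out.println(remappedFraction(100_000, 10, 11));
    }
}
```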
>> Here is the blurb and the code:
>> Source download includes:
>>  libketama, a C library
>>  php_ketama, a PHP extension that uses the library
>>  a pure Java implementation
>> I'm interested in other approaches (or even if this is a concern  
>> for most people?). I read about the lots-of-virtual-buckets  
>> concept, but didn't find an implementation. Anyone already using  
>> consistent hashing for memcache?
>> RJ

Ian Holsman
Ian at -- what's in your garden?

More information about the memcached mailing list