ketama - a consistent hashing algo for memcache clients

Steven Grimm sgrimm at
Wed Apr 11 06:13:11 UTC 2007

Only if you remove a server from the list when it's down. If you leave 
it in there and live with the cache misses (i.e., the same approach you 
take when you don't rehash keys in the non-consistent-hashing memcached 
client), then that's not an issue; you'll only move keys around when 
you administratively add or remove servers. And if you do a flush_all 
on a machine before you add it, there will be no stale data to worry 
about.
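
A minimal sketch of the policy Steven describes, in Python: leave the
dead server in the list and treat it as a cache miss. Everything here
is hypothetical for illustration (the server addresses, the connection
objects, and the plain modulo mapping, which stands in for whichever
hashing scheme the client uses):

    import hashlib
    import socket

    SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

    def server_for(key):
        # The mapping never changes in response to failures; only an
        # administrative change to SERVERS moves keys around.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return SERVERS[h % len(SERVERS)]

    def cache_get(connections, key):
        try:
            return connections[server_for(key)].get(key)  # hit or miss
        except socket.error:
            return None  # down server == miss; fall through to the DB
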


tbone wrote:
> Which could happen if the issue was intermittent (a network hiccup, a 
> semi-bad network cable, or even capped bandwidth on the memcached 
> machine [which I have seen happen]).
> Ian Holsman wrote:
>> The old entry would only exist for as long as your expiry time, so 
>> steps 3-5 would have to occur within your expiry window for this to 
>> take effect.
>> Regards,
>> Ian
>> On 11/04/2007, at 2:18 PM, Timo Ewalds wrote:
>>> I'd like to point out that the problem of stale cache entries exists 
>>> if you add AND remove servers without flushing all the data. Here's 
>>> the sequence that would cause it:
>>> 1. Start a bunch of servers
>>> 2. Run for a while
>>> 3. Add 1 server
>>> 4. Run for a while
>>> 5. Remove the server added in step 3.
>>> Now a key that mapped to one of the original servers, but moved to 
>>> the new server, maps to the old server again. If the key hasn't 
>>> expired on the old server, it'll be returned as if it were valid, 
>>> even though it may have been changed or invalidated on the new 
>>> server. To avoid this, you must never remove servers after adding 
>>> servers. Adding is fine, removing is fine, adding fresh/flushed 
>>> ones after removing is fine, but removing after adding is not.
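
A minimal sketch of Timo's five-step scenario, in Python. The ring
below is a toy (100 MD5-derived points per server), not libketama's
actual continuum, and the server names and key format are made up:

    import bisect
    import hashlib

    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

    class Ring:
        def __init__(self, servers, points_per_server=100):
            self.points = sorted(
                (_hash("%s-%d" % (s, i)), s)
                for s in servers for i in range(points_per_server))
            self.hashes = [h for h, _ in self.points]

        def server_for(self, key):
            # first point clockwise from the key (wrapping at the end)
            i = bisect.bisect(self.hashes, _hash(key)) % len(self.points)
            return self.points[i][1]

    before = Ring(["a", "b", "c"])          # steps 1-2: original pool
    during = Ring(["a", "b", "c", "new"])   # step 3: add a server
    after = Ring(["a", "b", "c"])           # step 5: remove it again

    keys = ["user:%d" % n for n in range(10000)]
    moved = [k for k in keys if during.server_for(k) == "new"]

    # `after` is identical to `before`, so every key that moved to
    # "new" now maps straight back to the server that may still hold
    # its un-expired, pre-step-3 value: stale data.
    assert all(after.server_for(k) == before.server_for(k) for k in moved)
    print("%d of %d keys moved and would read stale data"
          % (len(moved), len(keys)))
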
>>> Note that this problem is very similar to the one of rehashing key 
>>> locations when a server goes down and comes back up without being 
>>> flushed (i.e., intermittent network issues). I believe the 
>>> consensus fix is simply not to rehash, and to accept the extra DB 
>>> load from having a downed memcached server.
>>> As a separate issue, presumably if you're adding servers often, 
>>> you've got many servers. If you're adding 100 buckets per server 
>>> (to make the distribution even), what's the CPU usage like to 
>>> figure out which server to choose, compared to the simple hash 
>>> method? I'd guess that in PHP it could be quite a bit heavier, 
>>> since the bucket list needs to be rebuilt on each page load.
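
For what it's worth, with the usual sorted-points implementation a
lookup is one key hash plus a binary search over servers x buckets
points, so the per-get cost grows only logarithmically; the expensive
part is building the sorted list, which is exactly what a PHP client
would redo on every page unless it's cached (libketama, as I
understand it, caches its continuum in shared memory for this reason).
A rough Python sketch with made-up sizes:

    import bisect
    import hashlib
    import time

    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

    servers = ["10.0.0.%d:11211" % n for n in range(1, 21)]  # 20 servers

    t0 = time.time()
    points = sorted(_hash("%s-%d" % (s, i))
                    for s in servers for i in range(100))  # 2000 points
    build = time.time() - t0  # pay this once per process, not per get

    t0 = time.time()
    for n in range(100000):
        bisect.bisect(points, _hash("key%d" % n))  # per-get: O(log n)
    lookups = time.time() - t0

    print("build: %.4fs   100k lookups: %.4fs" % (build, lookups))
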
>>> Has anyone tried using consistent hashing not to make server 
>>> management easier, but to increase the locality of keys, so that 
>>> get_multi calls are more likely to touch fewer servers? Something 
>>> like having the top 20 bits be a hash of the prefix and the lower 
>>> 12 bits be a hash of the suffix (the id?). With consistent hashing, 
>>> keys with the same prefix would then be nearby on the number line, 
>>> and so likely to map to the same server.
>>> Another solution, which we already use for a specific case, is to 
>>> have different server pools for different data groups (I believe 
>>> Steven Grimm mentioned doing this), but it is annoying to do in the 
>>> general case.
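
The per-group pool idea is simple to sketch (the group names and
server addresses here are invented); each group hashes its keys over
its own list only:

    POOLS = {
        "sessions": ["10.0.1.1:11211", "10.0.1.2:11211"],
        "profiles": ["10.0.2.1:11211", "10.0.2.2:11211"],
    }

    def pool_for(group):
        # Resizing one pool only remaps keys within that data group.
        return POOLS[group]
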
>>> Timo
>>> Richard Jones wrote:
>>>> Every time we added memcached servers to the pool, the rehashing of 
>>>> all keys would kill our database while the cache filled up again. 
>>>> We implemented a consistent hashing algorithm that our memcached 
>>>> clients are now using to map keys to servers. Now we can add new 
>>>> memcached servers without causing chaos :)
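
The "without causing chaos" claim is easy to quantify with a toy ring
(again, not libketama's actual continuum; sizes and addresses are made
up). Adding a tenth server remaps roughly a tenth of the keys under
consistent hashing, versus nearly all of them under modulo hashing:

    import bisect
    import hashlib

    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

    def make_ring(servers, points_per_server=100):
        points = sorted((_hash("%s-%d" % (s, i)), s)
                        for s in servers for i in range(points_per_server))
        hashes = [p[0] for p in points]
        def lookup(key):
            return points[bisect.bisect(hashes, _hash(key)) % len(points)][1]
        return lookup

    old = ["10.0.0.%d" % n for n in range(1, 10)]  # 9 servers
    new = old + ["10.0.0.10"]                      # add a 10th

    ring_old, ring_new = make_ring(old), make_ring(new)
    keys = ["key%d" % n for n in range(20000)]

    moved_modulo = sum(_hash(k) % 9 != _hash(k) % 10 for k in keys)
    moved_ring = sum(ring_old(k) != ring_new(k) for k in keys)
    print("modulo: %d%% remapped, ring: %d%%"
          % (100 * moved_modulo // len(keys), 100 * moved_ring // len(keys)))
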
>>>> Here is the blurb and the code:
>>>> Source download includes:
>>>>  libketama, a C library
>>>>  php_ketama, a PHP extension that uses the library
>>>>  a pure Java implementation
>>>> I'm interested in other approaches (or even in whether this is a 
>>>> concern for most people). I read about the lots-of-virtual-buckets 
>>>> concept, but didn't find an implementation. Is anyone already 
>>>> using consistent hashing for memcache?
>>>> RJ
>> -- 
>> Ian Holsman
>> Ian at
>> -- what's in your garden?
