ketama - a consistent hashing algo for memcache clients

Timo Ewalds timo at tzc.com
Wed Apr 11 04:58:56 UTC 2007


This is true, but many people have very long expiry times. We frequently 
use expiry times in the several-week range, and I expect many people 
set them to never expire.

Timo

Ian Holsman wrote:
>
> the old entry would only survive for as long as your expiry time.
>
> so steps 3-5 would have to occur within your expiry window for this 
> to be a problem.
>
>
> regards
> Ian
>
> On 11/04/2007, at 2:18 PM, Timo Ewalds wrote:
>
>> I'd like to point out that the problem of stale cache entries exists 
>> if you add AND later remove servers without flushing all the data. 
>> Here's the sequence that would cause it:
>>
>> 1. Start a bunch of servers
>> 2. Run for a while
>> 3. Add 1 server
>> 4. Run for a while
>> 5. Remove the server added in step 3.
>>
>> Now a key that mapped to one of the original servers, but moved to 
>> the new server in step 3, maps to the old server again. If the key 
>> hasn't expired on the old server, it'll be returned as if it were 
>> valid, even though it may have been changed or invalidated on the new 
>> server. To avoid this, you can never remove servers after adding 
>> servers. Adding is fine, removing is fine, adding fresh/flushed ones 
>> after removing is fine, but removing after adding is not.
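>>
>> To make that concrete, here's a toy Java sketch of the sequence (a 
>> deliberately minimal ring with made-up server names, not any 
>> particular client library):
>>
>> import java.nio.charset.StandardCharsets;
>> import java.security.MessageDigest;
>> import java.util.TreeMap;
>>
>> // Toy consistent-hash ring, just enough to show the add-then-remove problem.
>> class ToyRing {
>>     private final TreeMap<Long, String> ring = new TreeMap<>();
>>
>>     void add(String server) {                          // 100 points per server
>>         for (int i = 0; i < 100; i++) ring.put(hash(server + "-" + i), server);
>>     }
>>
>>     void remove(String server) {
>>         ring.values().removeIf(s -> s.equals(server));
>>     }
>>
>>     String serverFor(String key) {
>>         Long point = ring.ceilingKey(hash(key));
>>         if (point == null) point = ring.firstKey();    // wrap around the ring
>>         return ring.get(point);
>>     }
>>
>>     private static long hash(String s) {               // first 4 bytes of MD5, unsigned
>>         try {
>>             byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
>>             return ((d[0] & 0xFFL) << 24) | ((d[1] & 0xFF) << 16) | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
>>         } catch (Exception e) { throw new RuntimeException(e); }
>>     }
>>
>>     public static void main(String[] args) {
>>         ToyRing ring = new ToyRing();
>>         ring.add("cacheA"); ring.add("cacheB"); ring.add("cacheC");  // step 1
>>
>>         String[] before = new String[10000];                         // step 2: note owners
>>         for (int k = 0; k < before.length; k++) before[k] = ring.serverFor("key" + k);
>>
>>         ring.add("cacheD");                                          // step 3
>>         String[] during = new String[before.length];                 // step 4
>>         for (int k = 0; k < before.length; k++) during[k] = ring.serverFor("key" + k);
>>
>>         ring.remove("cacheD");                                       // step 5
>>         int stale = 0;
>>         for (int k = 0; k < before.length; k++)
>>             if (!during[k].equals(before[k]) && ring.serverFor("key" + k).equals(before[k])) stale++;
>>
>>         // Every key that moved to cacheD now maps back to its original server,
>>         // which may still hold a copy written before step 3.
>>         System.out.println(stale + " of 10000 keys moved to cacheD and back");
>>     }
>> }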
>>
>> Note that this problem is very similar to the one of rehashing key 
>> locations when a server goes down and comes back up without being 
>> flushed (i.e. intermittent network issues). I believe the consensus 
>> fix is simply not to rehash, and to accept the extra db load from 
>> having a downed memcache server.
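>>
>> Client-side, "don't rehash" just means a dead server is treated as a 
>> miss rather than having its keys remapped; a rough sketch (a plain 
>> modulo owner and a stand-in liveness set, purely for illustration):
>>
>> import java.util.List;
>> import java.util.Optional;
>> import java.util.Set;
>>
>> // A key's owner never changes when a server dies, so a flapping server
>> // cannot come back and serve stale data for keys it no longer owns.
>> class NoRehashLookup {
>>     private final List<String> servers;
>>     private final Set<String> down;    // stand-in for real failure detection
>>
>>     NoRehashLookup(List<String> servers, Set<String> down) {
>>         this.servers = servers;
>>         this.down = down;
>>     }
>>
>>     // Empty means "cache miss, go to the database"; the key is NOT
>>     // retried against a different server.
>>     Optional<String> serverFor(String key) {
>>         String owner = servers.get((key.hashCode() & 0x7fffffff) % servers.size());
>>         return down.contains(owner) ? Optional.empty() : Optional.of(owner);
>>     }
>>
>>     public static void main(String[] args) {
>>         NoRehashLookup lookup = new NoRehashLookup(
>>                 List.of("cacheA", "cacheB", "cacheC"), Set.of("cacheB"));
>>         System.out.println(lookup.serverFor("user:42"));   // empty if cacheB owns it
>>     }
>> }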
>>
>> As a separate issue, presumably if you're adding servers often, 
>> you've got many servers. If you're adding 100 buckets per server (to 
>> make the distribution even), what's the CPU cost of figuring out 
>> which server to choose, compared to the simple hash method? I'd guess 
>> that in PHP it could be quite a bit heavier, since the bucket list 
>> needs to be rebuilt on each page request.
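>>
>> To put rough numbers on it: the per-key lookup in a virtual-bucket 
>> scheme is typically just a binary search over the sorted bucket 
>> points, so around log2(100 * servers) comparisons, versus a single 
>> modulo for the simple scheme. The expensive part is building and 
>> sorting the point list, which a PHP page would redo on every request 
>> unless it's cached somewhere. A toy Java sketch of the two lookups:
>>
>> import java.util.Arrays;
>> import java.util.Random;
>>
>> class LookupCost {
>>     public static void main(String[] args) {
>>         int servers = 20, pointsPerServer = 100;
>>
>>         // Build phase: the part that is costly to repeat on every page
>>         // request (hashing and sorting servers * pointsPerServer points;
>>         // random points stand in for real bucket hashes here).
>>         long[] points = new long[servers * pointsPerServer];
>>         Random rnd = new Random(1);
>>         for (int i = 0; i < points.length; i++) points[i] = rnd.nextInt() & 0xFFFFFFFFL;
>>         Arrays.sort(points);
>>
>>         // Lookup phase: one hash plus a binary search, about 11 steps
>>         // for 2000 points.
>>         long keyHash = "user:1234".hashCode() & 0xFFFFFFFFL;
>>         int idx = Arrays.binarySearch(points, keyHash);
>>         if (idx < 0) idx = -idx - 1;          // first point >= keyHash
>>         if (idx == points.length) idx = 0;    // wrap around the ring
>>         System.out.println("virtual buckets: point #" + idx + " of " + points.length);
>>
>>         // The simple scheme is a single modulo on the key hash.
>>         System.out.println("simple hash: server #" + (keyHash % servers));
>>     }
>> }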
>>
>> Has anyone tried using consistent hashing not to make server 
>> management easier, but to increase locality of keys, so that 
>> get_multi calls are more likely to hit fewer servers? Something like 
>> having the top 20 bits be a hash of the key's prefix and the lower 12 
>> bits be a hash of the suffix (the id?). With consistent hashing, keys 
>> with the same prefix would then be nearby on the number line, and so 
>> likely to map to the same server.
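>>
>> Roughly what I have in mind (the 20/12 split and the hash functions 
>> here are arbitrary placeholders), in Java:
>>
>> // The high 20 bits of a key's ring position come from its prefix and the
>> // low 12 bits from its suffix, so keys sharing a prefix land within a
>> // 4096-wide window on the number line and usually map to the same server.
>> class PrefixLocalityHash {
>>     static long localityHash(String prefix, String suffix) {
>>         long hi = (prefix.hashCode() & 0xFFFFFFFFL) % (1 << 20);   // 20 bits of prefix
>>         long lo = (suffix.hashCode() & 0xFFFFFFFFL) % (1 << 12);   // 12 bits of suffix
>>         return (hi << 12) | lo;                                    // 32-bit ring position
>>     }
>>
>>     public static void main(String[] args) {
>>         // Same prefix: positions within 4096 of each other.
>>         System.out.println(localityHash("user:1234", "name"));
>>         System.out.println(localityHash("user:1234", "email"));
>>         // Different prefix: almost certainly a distant position.
>>         System.out.println(localityHash("user:9999", "name"));
>>     }
>> }
>>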
>> Another solution, which we already use for a specific case, would be 
>> to have different server pools for different data groups (I believe 
>> Steven Grimm mentioned doing this), but it is annoying to do in the 
>> general case.
>>
>> Timo
>>
>>
>> Richard Jones wrote:
>>> Every time we added memcached servers to the pool, the rehashing of 
>>> all keys would kill our database while the cache filled up again. We 
>>> implemented a consistent hashing algorithm that our memcached 
>>> clients are now using to map keys to servers. Now we can add new 
>>> memcached servers without causing chaos :)
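>>>
>>> To put a number on why the rehashing hurt: with the naive 
>>> hash-modulo-N scheme, growing a pool from 3 to 4 servers moves about 
>>> three quarters of all keys, so roughly that fraction of lookups 
>>> suddenly miss and fall through to the database. A quick 
>>> back-of-the-envelope check in Java (not taken from the ketama code):
>>>
>>> // How many keys change servers when a pool of 3 grows to 4 under
>>> // "hash(key) % N".  (A consistent-hash continuum only moves the roughly
>>> // one-in-four keys that now belong to the new server.)
>>> class RemapFraction {
>>>     public static void main(String[] args) {
>>>         int keys = 100000, moved = 0;
>>>         for (int k = 0; k < keys; k++) {
>>>             int h = ("key" + k).hashCode() & 0x7fffffff;
>>>             if (h % 3 != h % 4) moved++;   // owner changed after adding a server
>>>         }
>>>         System.out.printf("%d of %d keys (%.0f%%) change servers%n",
>>>                 moved, keys, 100.0 * moved / keys);
>>>     }
>>> }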
>>>
>>> Here is the blurb and the code:
>>> http://www.last.fm/user/RJ/journal/2007/04/10/392555/
>>>
>>> Source download includes:
>>>  libketama, a C library
>>>  php_ketama, a PHP extension that uses the library
>>>  a pure Java implementation
>>>
>>> I'm interested in other approaches (and in whether this is even a 
>>> concern for most people). I read about the lots-of-virtual-buckets 
>>> concept, but didn't find an implementation. Anyone already using 
>>> consistent hashing for memcache?
>>>
>>> RJ
>>>
>>>
>
> -- 
> Ian Holsman
> Ian at Holsman.net
> http://garden-gossip.com/ -- what's in your garden?
>
>
>

