Atomic replace and append operations

Fri Mar 18 12:48:12 PST 2005

I'll third that one. I've suggested it before as well.

Brad answered my question a while ago with how he grabs the list of 
comments on livejournal. He answered that he has a list of comment ids 
in a memcache entry. He then grabs the individual ones one at a time 
based on the contents of the first. When a new comment is added, it gets 
added to the list of comment ids. I asked how to avoid the race 
condition, but never got an answer to that. An append operation would 
fix it.

Timo

Gregory Block wrote:

> I'd second these additions - I've been waiting with a similar kind of 
> need for some time, and would be very happy to see this added.
>
> On 18 Mar 2005, at 13:55, Richard Cameron wrote:
>
>>
>> I'm interested in implementing two extra storage commands "areplace" 
>> and "append" for memcached. The former would behave exactly like 
>> "replace", but would return the object's value before it performs the 
>> replace operation; the latter would simply string append the new 
>> value onto the old one.
>>
>> The motivation is not purely to add extra bloat to memcached but 
>> rather to provide the minimal extra operations I'd need to be able to 
>> keep track of dependency trees in the cached data.
>>
>> For instance, if I want to cache objects A and B (the results of two 
>> SQL query, say) which depends on some "thing" in the database (like a 
>> particular user U's account), I'd like to be able keep track of my 
>> dependencies like this:
>>
>> U -> A,B
>>
>> Memcached itself seems like a good choice to store this list. So, 
>> when user U does something to his account, I'll want to delete the 
>> newly defunct objects from the cache. I think the best I can do at 
>> the moment is:
>>
>> my_deps := [get U]
>> foreach dep $my_deps:
>>     [delete $dep]
>> [set U ""]
>>
>> Also, if I perform another cacheable operation C which depends on U, 
>> I'd like to be able to add this dependency to my list. The existing 
>> protocol lets me say:
>>
>> old_deps := "get U"
>> new_deps := append(old_deps, C)
>> [set U $new_deps]
>>
>> The trouble, of course, is that there are race conditions in both 
>> cases where multi-threaded code could execute in a way which would 
>> corrupt the dependency list.
>>
>> On a single machine, I can simply guard the critical section in the 
>> client code with a mutex, but this won't work on a distributed server 
>> farm. It would be possible to use memcached's existing "incr" and 
>> "decr" operations to produce a "global" mutex, but this is a) not 
>> terribly efficient, and b) dangerous if one server incr's the 
>> semaphore on entry to the critical section but crashes before it 
>> exists - this would break things such that no machine could 
>> subsequently update the dependency and we'd be stuck in that state 
>> for good.
>>
>> So, my proposed solution is to implement an "areplace" (atomic 
>> replace) operation which looks like this:
>>
>> my_deps = [areplace U ""]
>>
>> and and append operation which looks like this:
>>
>> [append U " C"]
>>
>> I could then keep most of the bloat in the client library, not the 
>> server. This approach also has the advantage of being able to track 
>> dependencies over multiple memcached instances.
>>
>> This is probably bending memcached's intended use slightly, but I 
>> think it still adheres to the basic principle that it's a cache: all 
>> the data can be regenerated from the database if required. The 
>> advantage is that I can immediately tell when data should be marked 
>> as being invalid, which means I can use much longer cache times 
>> which, in turn, means that the cached items in the "long tail" 
>> (object which get accessed infrequently but are expensive to compute) 
>> can have much higher hit rates.
>>
>> Seem at all sensible? I'd be interested in any comments before I go 
>> away and code this up.
>>
>> Richard.
>>
>
>
>