Atomic replace and append operations

Fri Mar 18 06:40:54 PST 2005

I'd second these additions - I've been waiting with a similar kind of 
need for some time, and would be very happy to see this added.

On 18 Mar 2005, at 13:55, Richard Cameron wrote:

>
> I'm interested in implementing two extra storage commands "areplace" 
> and "append" for memcached. The former would behave exactly like 
> "replace", but would return the object's value before it performs the 
> replace operation; the latter would simply string append the new value 
> onto the old one.
>
> The motivation is not purely to add extra bloat to memcached but 
> rather to provide the minimal extra operations I'd need to be able to 
> keep track of dependency trees in the cached data.
>
> For instance, if I want to cache objects A and B (the results of two 
> SQL query, say) which depends on some "thing" in the database (like a 
> particular user U's account), I'd like to be able keep track of my 
> dependencies like this:
>
> U -> A,B
>
> Memcached itself seems like a good choice to store this list. So, when 
> user U does something to his account, I'll want to delete the newly 
> defunct objects from the cache. I think the best I can do at the 
> moment is:
>
> my_deps := [get U]
> foreach dep $my_deps:
> 	[delete $dep]
> [set U ""]
>
> Also, if I perform another cacheable operation C which depends on U, 
> I'd like to be able to add this dependency to my list. The existing 
> protocol lets me say:
>
> old_deps := "get U"
> new_deps := append(old_deps, C)
> [set U $new_deps]
>
> The trouble, of course, is that there are race conditions in both 
> cases where multi-threaded code could execute in a way which would 
> corrupt the dependency list.
>
> On a single machine, I can simply guard the critical section in the 
> client code with a mutex, but this won't work on a distributed server 
> farm. It would be possible to use memcached's existing "incr" and 
> "decr" operations to produce a "global" mutex, but this is a) not 
> terribly efficient, and b) dangerous if one server incr's the 
> semaphore on entry to the critical section but crashes before it 
> exists - this would break things such that no machine could 
> subsequently update the dependency and we'd be stuck in that state for 
> good.
>
> So, my proposed solution is to implement an "areplace" (atomic 
> replace) operation which looks like this:
>
> my_deps = [areplace U ""]
>
> and and append operation which looks like this:
>
> [append U " C"]
>
> I could then keep most of the bloat in the client library, not the 
> server. This approach also has the advantage of being able to track 
> dependencies over multiple memcached instances.
>
> This is probably bending memcached's intended use slightly, but I 
> think it still adheres to the basic principle that it's a cache: all 
> the data can be regenerated from the database if required. The 
> advantage is that I can immediately tell when data should be marked as 
> being invalid, which means I can use much longer cache times which, in 
> turn, means that the cached items in the "long tail" (object which get 
> accessed infrequently but are expensive to compute) can have much 
> higher hit rates.
>
> Seem at all sensible? I'd be interested in any comments before I go 
> away and code this up.
>
> Richard.
>