CAS is broken

Tomash Brechko tomash.brechko at gmail.com
Tue Nov 13 08:07:58 UTC 2007


On Sun, Nov 11, 2007 at 22:46:23 -0800, Chris Goffinet wrote:
> The mutex part is handled right now using the threads code, every item  
> allocation has a mutex around it, so this should be safe by calling  
> inside the item_allocation.

Alright, we have agreed on this part.  But now I think there's another
race with CAS, orthogonal to the one Dustin and you are solving:


> diff --git a/memcached.c b/memcached.c
> --- a/memcached.c
> +++ b/memcached.c
> @@ -1255,35 +1254,30 @@ static void process_update_command(conn 
>      it = item_alloc(key, nkey, flags, realtime(exptime), vlen+2);

Here 'it' gets the new CAS.


>      /* HANDLE_CAS VALIDATION */
> +    if (handle_cas == true) {
> +      	item *itmp=item_get(key, it->nkey);
> +      	if(itmp && itmp->cas_id == req_cas_id) {

Here we are testing the old CAS value against what user supplied.  But
what if two threads both will call process_update_command()?  They
both may succeed with this check, because here old CAS is still
returned, we replace the item (and thus set the new CAS value for the
key) much later.  The problem is that testing of CAS for a given key
and actually setting the new CAS is not atomic.  AFAIU, we effectively
set new CAS on do_item_replace() call, and so we should somehow make
this test-and-set atomic (and huge mutex over the whole time of the
call would kill performance).  One option might be to move CAS test to
do_store_item() close to do_item_replace(), and protect both calls
with a mutex (or better to use spinlocks since we never do conditional
waits).

Or am I mistaken somewhere?


-- 
   Tomash Brechko


More information about the memcached mailing list