memcached(8) losing fd events...

Fri Aug 19 09:07:27 PDT 2005

Maybe you could also provide an example of the proper setup and use of
custom error handlers ;)  For some reason I can't seem to get it right.
I'm sure it's a brain-dead mistake on my part.  If you look at the
latest php-mcache beta, it crashes during error handling when the server
is down and you try a set.  Pretty irritating, but I haven't had time to
trace through everything yet.

John A. McCaskey
Software Development Engineer
Klir Technologies, Inc.
johnm at klir.com
206.902.2027

-----Original Message-----
From: Sean Chittenden [mailto:sean at gigave.com] 
Sent: Friday, August 19, 2005 9:05 AM
To: John McCaskey
Cc: memcached at lists.danga.com
Subject: Re: memcached(8) losing fd events...

> I believe we've been experiencing this.  We actually pulled
> memcached/libmemcache/php-mcache out of production last week because
> sometimes the server seemed to just hang on requests.  I always
> thought it was libmemcache's fault but after many hours and many
> attempted fixes that did nothing I gave up on it and we just removed
> it for the moment.

Yeah... I was getting ready to throw in the towel on v1 of my library
because I was having problems chasing down bugs like these.  Given that
memcache(3) was the new kid on the block compared to memcached(8), I
didn't think the problem would be on the server side.  In 1.3.X it was
very possible for memcache(3) to have problems, but in 1.4 everything
is new and much cleaner.  I'm copying data into buffers now, so the lib
is a tad slower, but I don't think anyone is going to complain about
getting 25K req/s on an amd64 that's in production.

I'm defining "new code" as:

% diff -ur libmemcache-1.3.0.rc2 libmemcache-1.4.0.b2 | egrep -r '^[-+]' | wc -l
    4619
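
For anyone who wants to reproduce that kind of count without the shell
pipeline, here is a rough Python sketch of the same idea: count the
lines of a unified diff that start with '+' or '-'.  The helper name
count_changed_lines is made up for illustration, and it compares two
in-memory lists of lines rather than two directory trees as diff -ur
does.  Like the egrep pattern, it also counts the '---' / '+++' header
lines.

```python
import difflib


def count_changed_lines(old_lines, new_lines):
    """Count unified-diff lines that start with '+' or '-'.

    Hypothetical helper: mirrors the '^[-+]' pattern, so the
    '---' / '+++' file header lines are counted too.
    """
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
    )
    return sum(1 for line in diff if line.startswith(("+", "-")))
```

Two identical inputs produce an empty diff, so the count is 0.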

Very new, but much better.  It's probably too late now, but I'd be
curious to know whether renicing your memcached(8) procs to +20 would
solve your problems while using 1.4.  I bet it would, or would come
very close to resolving them 99.99% of the time.  The busier the
server, the higher the chance that a nice of +20 will fix the hangs.
What's neat is that in renicing memcached(8) to +20, I only lost
~100 req/s out of 10K on my amd32 desktop.  I haven't done any
profiling yet, but here are the benchmark numbers:

amd32:
Value size:     10
Num tests:      10000
Test    Ops per second  Total Time (s)  Time per Request (s)
set     9730.058974     1.027743        0.000103
get     10069.479408    0.993100        0.000099
add     9490.808626     1.053651        0.000105
delete  11943.036493    0.837308        0.000084

amd64:
Value size:     10
Num tests:      10000
Test    Ops per second  Total Time (s)  Time per Request (s)
set     25755.804070    0.388262        0.000039
get     21686.328938    0.461120        0.000046
add     17181.599195    0.582018        0.000058
delete  22730.992344    0.439928        0.000044
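
The renice experiment described before the benchmark tables can also be
scripted.  Below is a minimal Python sketch, assuming a Unix-like
system; the function name renice is made up for illustration, and the
pid of the memcached(8) process has to come from somewhere else (a pid
file, ps, etc.).  On Linux the kernel caps the nice value at 19, so a
request for +20 ends up as 19 there.

```python
import os


def renice(pid, niceness=20):
    """Set the nice value of process `pid` and return the value in effect.

    Hypothetical helper.  Assumes a Unix-like OS.  The value actually
    applied may be lower than requested (Linux caps it at 19).
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)
```

Lowering the priority of your own processes doesn't need root, but
raising it back afterwards normally does.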

I'm actively soliciting feedback on the 1.4 branch.  I'm tinkering
with the idea of using mmap(2) for the buffers so I can get some
zero-copy-socket action.  I'll probably also switch from select(2) to
kqueue(2), but that's a lesser priority since select(2) is out of the
call path and only used when the server blocks.  -sc
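
To illustrate what "out of the call path" can look like, here is a
small Python sketch of the general pattern: try the read on a
non-blocking socket first, and only fall back to select() if the read
would block.  This is not libmemcache's actual code; the function name
recv_some, the timeout, and the socketpair demo are all assumptions
made for the example.

```python
import select
import socket


def recv_some(sock, nbytes, timeout=1.0):
    """Read up to nbytes from a non-blocking socket.

    recv() is tried first; select() is only called when the read
    would block.  Returns b"" on EOF and None on timeout.
    """
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # No data yet: wait here, off the fast path.
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                return None


# Demo: a local socket pair stands in for the client/server connection.
client, server = socket.socketpair()
client.setblocking(False)
server.sendall(b"STORED\r\n")
reply = recv_some(client, 64)
```

When the reply is already in the socket buffer, recv() returns right
away and select() is never called.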

-- 
Sean Chittenden