memcached(8) losing fd events...
Sean Chittenden
sean at gigave.com
Fri Aug 19 09:04:59 PDT 2005
> I believe we've been experiencing this. We actually pulled
> memcached/libmemcache/php-mcache out of production last week because
> sometimes the server seemed to just hang on requests. I always
> thought it was libmemcache's fault but after many hours and many
> attempted fixes that did nothing I gave up on it and we just removed
> it for the moment.
Yeah... I was getting ready to throw in the towel on v1 of my library
because I was having problems chasing down bugs like these. Given that
memcache(3) was the new kid on the block compared to memcached(8), I
didn't think the problem would be on the server. Now, in 1.3.X it was
very possible for memcache(3) to have problems, but in 1.4 everything
is new and much cleaner. I'm copying data into buffers, however, so
the lib is a tad slower, but I doubt anyone's going to complain about
getting 25Kr/s on an amd64 that's in production.
I'm defining "new code" as:
% diff -ur libmemcache-1.3.0.rc2 libmemcache-1.4.0.b2 | egrep -r '^[-+]' | wc -l
4619
Very new, but much better. It's probably too late now, but I'd be
curious to know whether renicing your memcached(8) procs to +20 would
solve your problems while using 1.4. I bet it does, or comes very
close to resolving them 99.99% of the time. The busier the server, the
higher the chance that a nice +20 will fix the hangs. What's neat is
that in renicing memcached(8) to +20, I only lost ~100r/s out of 10K
on my amd32 desktop. I haven't done any profiling yet, but:
amd32:
Value size: 10
Num tests:  10000
Test    Ops per second  Total Time (s)  Time per Request (s)
set        9730.058974        1.027743              0.000103
get       10069.479408        0.993100              0.000099
add        9490.808626        1.053651              0.000105
delete    11943.036493        0.837308              0.000084
amd64:
Value size: 10
Num tests:  10000
Test    Ops per second  Total Time (s)  Time per Request (s)
set       25755.804070        0.388262              0.000039
get       21686.328938        0.461120              0.000046
add       17181.599195        0.582018              0.000058
delete    22730.992344        0.439928              0.000044
I'm actively soliciting feedback on the 1.4 branch. I'm tinkering
with the idea of using mmap(2) for the buffers so I can get some
zero-copy-socket action. It's a lesser priority, but I'm also probably
going to switch to using kqueue(2) instead of select(2); select(2) is
out of the main call path, though, and is only used when the server
blocks. -sc
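
To illustrate what "select(2) only when the server blocks" can look
like, here is a rough sketch of the general pattern, not the actual
memcache(3) code: try read(2) on a non-blocking descriptor first, and
fall into select(2) only if the kernel says no data is ready yet. The
function names set_nonblocking() and read_or_wait() and the timeout
parameter are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read up to len bytes from a non-blocking fd.
 * Fast path: read(2) directly, with no select(2) call at all.
 * Slow path: only on EAGAIN/EWOULDBLOCK, wait in select(2) and retry.
 * Returns bytes read, 0 on EOF, or -1 on error; a timeout is reported
 * as -1 with errno set to ETIMEDOUT.
 */
static ssize_t read_or_wait(int fd, void *buf, size_t len, int timeout_sec)
{
    if (fd < 0 || fd >= FD_SETSIZE) {   /* select(2) cannot handle it */
        errno = EBADF;
        return -1;
    }

    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                   /* data or EOF, no select needed */
        if (errno == EINTR)
            continue;                   /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* real error */

        /* The peer has nothing for us yet: block here until it does. */
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        struct timeval tv;
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        if (rc == -1 && errno != EINTR)
            return -1;
        /* fd is readable (or we were interrupted): loop and read again */
    }
}
```

With this shape a busy server that answers promptly never touches
select(2), so swapping it for kqueue(2) would only change the slow
path.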
--
Sean Chittenden