Busy loop and blocking threads on 1.2.5
Matti Savolainen
matti.savolainen at dynamoid.com
Thu Apr 3 09:20:46 UTC 2008
Hi,
This morning we switched from an ancient memcached version to 1.2.5,
running on 2 machines, both x86_64, with 7000MB for memcached. The servers
worked fine until one of them reached about 2.8GB of allocated memory.
One thread then went into a busy loop while the other 3 threads were
waiting on a mutex, and the server stopped accepting connections. Here's
some info on the systems, versions, and GDB output.
Running RHEL with kernel 2.6.18; libevent is 1.4.2-rc, memcached is the 1.2.5 release.
Started with:
-m 7000 -c 8192 -d -P /var/run/memcached.pid -p 11211 -u memcached
I attached GDB, and this was the state it was in:
Program terminated with signal 5, Trace/breakpoint trap.
#0 0x0000003f5000c888 in __lll_mutex_lock_wait () from
/lib64/libpthread.so.0
(gdb) info threads
4 process 2258 0x0000003f5000c888 in __lll_mutex_lock_wait () from
/lib64/libpthread.so.0
3 process 2261 0x0000003f5000c888 in __lll_mutex_lock_wait () from
/lib64/libpthread.so.0
2 process 2260 assoc_find (key=0x1bab05c4
"image_commentstream_76966976", nkey=28) at assoc.c:506
* 1 process 2259 0x0000003f5000c888 in __lll_mutex_lock_wait () from
/lib64/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (process 2260)]#0 assoc_find (key=0x1bab05c4
"image_commentstream_76966976", nkey=28) at assoc.c:506
506 it = it->h_next;
(gdb) l
501 while (it) {
502 if ((nkey == it->nkey) &&
503 (memcmp(key, ITEM_key(it), nkey) == 0)) {
504 return it;
505 }
506 it = it->h_next;
507 }
508 return 0;
509 }
510
(gdb) print *it
$1 = {next = 0x2aab1571a300, prev = 0x2aaac57ba6f0, h_next =
0x2aaac57ba688, time = 1211888, exptime = 1380104, nbytes = 8, refcount = 0,
nsuffix = 6 '\006', it_flags = 1 '\001', slabs_clsid = 1 '\001', nkey
= 14 '\016', cas_id = 1643955, end = 0x2aaac57ba6c0}
(gdb) print it
$2 = (item *) 0x2aaac57ba688
(gdb) bt
#0 assoc_find (key=0x1bab05c4 "image_commentstream_76966976", nkey=28)
at assoc.c:506
#1 0x0000000000407bce in do_item_get_notedeleted (key=0x3f583623 "",
nkey=3830567580, delete_locked=0x11bac5) at items.c:416
#2 0x0000000000408f84 in mt_item_get_notedeleted (key=0x1bab05c4
"image_commentstream_76966976", nkey=28, delete_locked=0x0) at thread.c:426
#3 0x0000000000403955 in process_get_command (c=0x1b2e58e0,
tokens=0x41400f20, ntokens=1161925, return_cas=false) at memcached.c:1241
#4 0x0000000000405c57 in try_read_command (c=0x1b2e58e0) at
memcached.c:1671
#5 0x0000000000406679 in event_handler (fd=<value optimized out>,
which=-11620, arg=0x1b2e58e0) at memcached.c:2135
#6 0x00002aaaaaac3848 in event_base_loop (base=0x1aad8250, flags=0) at
event.c:366
#7 0x0000000000408b34 in worker_libevent (arg=0x1aa9fd40) at thread.c:324
#8 0x0000003f500062f7 in start_thread () from /lib64/libpthread.so.0
#9 0x0000003f4f4ce85d in clone () from /lib64/libc.so.6
Here's what the other threads are waiting on:
(gdb) thread 1
[Switching to thread 1 (process 2259)]#0 0x0000003f5000c888 in
__lll_mutex_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x0000003f5000c888 in __lll_mutex_lock_wait () from
/lib64/libpthread.so.0
#1 0x0000003f500088b5 in _L_mutex_lock_107 () from /lib64/libpthread.so.0
#2 0x0000003f50008343 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000408f76 in mt_item_get_notedeleted (key=0x2aaabb696fe4
"image_61698941", nkey=0, delete_locked=0x3f5000f818) at thread.c:425
#4 0x0000000000403955 in process_get_command (c=0x1b2b6470,
tokens=0x409fff20, ntokens=271925180440, return_cas=false) at
memcached.c:1241
#5 0x0000000000405c57 in try_read_command (c=0x1b2b6470) at
memcached.c:1671
#6 0x0000000000406679 in event_handler (fd=<value optimized out>,
which=0, arg=0x1b2b6470) at memcached.c:2135
#7 0x00002aaaaaac3848 in event_base_loop (base=0x1aa9ff50, flags=0) at
event.c:366
#8 0x0000000000408b34 in worker_libevent (arg=0x1aa9fc40) at thread.c:324
#9 0x0000003f500062f7 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003f4f4ce85d in clone () from /lib64/libc.so.6
As you can see, it->h_next points to the same address as it itself
(both are 0x2aaac57ba688), so the while loop in assoc_find never
terminates. I made a core dump of the process, so I can get additional
info as needed.
Cheers
--
Matti Savolainen
Software Developer
Sulake Dynamoid oy