Memcache 1.2.5 segfaults

Hugo Hallqvist hugo at dokad.se
Fri Jun 13 12:55:52 UTC 2008


Hi list,

We're using memcached to cache documents in our application, and we've
run into stability problems: memcached segfaults after it has been
running for some time.
We're using memcached version 1.2.5 on Linux, kernel version 2.6.22
from Ubuntu. We have seen the crashes on 3 different computers
running 2 different kernel versions, so it seems likely that this is
memcached-related.

Does anyone recognize these problems? Is there any information we can
add to help troubleshoot them?

This is the stack trace from gdb:
Core was generated by `/usr/local/bin/memcached -vv -m 1500 -p 11211
-u root -r'.
Program terminated with signal 6, Aborted.
#0  0x00002af105d3b765 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00002af105d3b765 in raise () from /lib/libc.so.6
#1  0x00002af105d3d1c0 in abort () from /lib/libc.so.6
#2  0x00002af105d7460b in ?? () from /lib/libc.so.6
#3  0x00002af105d7bb0a in ?? () from /lib/libc.so.6
#4  0x00002af105d7d73e in ?? () from /lib/libc.so.6
#5  0x00002af105d7f979 in realloc () from /lib/libc.so.6
#6  0x0000000000402532 in do_suffix_add_to_freelist (s=0x660590 " 1001\r\n")
   at memcached.c:596
#7  0x0000000000402628 in conn_cleanup (c=0x656330) at memcached.c:413
#8  0x00000000004026f4 in conn_close (c=0x656330) at memcached.c:459
#9  0x00000000004063fd in event_handler (fd=<value optimized out>, which=2793,
   arg=0x656330) at memcached.c:2309
#10 0x00002af105af6f99 in event_base_loop (base=0x613d80, flags=0)
   at event.c:331
#11 0x00000000004049bf in main (argc=-1524798896, argv=<value optimized out>)
   at memcached.c:3130

As the issue seems memory-related, I tried running memcached under
valgrind and got the following errors:
valgrind /usr/local/bin/memcached -vv -c 10 -m 1500 -p 11211 -u root -r

<16 add 19BD1FAA62B46055817FE6FA5E8E9F 2 1800 1532856
>16 SERVER_ERROR object too large for cache
==2889==
==2889== Invalid write of size 8
==2889==    at 0x402558: do_suffix_add_to_freelist (memcached.c:600)
==2889==    by 0x4067DC: event_handler (memcached.c:2274)
==2889==    by 0x4E2CF98: event_base_loop (event.c:331)
==2889==    by 0x4049BE: main (memcached.c:3130)
==2889==  Address 0x40B23C0 is not stack'd, malloc'd or (recently) free'd
==2889==
==2889== Invalid write of size 8
==2889==    at 0x40250E: do_suffix_add_to_freelist (memcached.c:592)
==2889==    by 0x4067DC: event_handler (memcached.c:2274)
==2889==    by 0x4E2CF98: event_base_loop (event.c:331)
==2889==    by 0x4049BE: main (memcached.c:3130)
==2889==  Address 0x40B23C8 is not stack'd, malloc'd or (recently) free'd

It doesn't crash at this point, but a few searches later it does:

>27 sending key document:7185484416683036457
>27 SERVER_ERROR out of memory making CAS suffix
<16 add 19BD1FAA62B46055817FE6FA5E8E9F 2 1800 1532856
>16 SERVER_ERROR object too large for cache
==2889==
==2889== Invalid read of size 8
==2889==    at 0x4E2C4D2: event_queue_insert (event.c:892)
==2889==    by 0x4E3814C: epoll_dispatch (epoll.c:243)
==2889==    by 0x4E2CE60: event_base_loop (event.c:440)
==2889==    by 0x4049BE: main (memcached.c:3130)
==2889==  Address 0x7203EE0 is 8 bytes after a block of size 24 alloc'd
==2889==    at 0x4C21C16: malloc (vg_replace_malloc.c:149)
==2889==    by 0x403C86: process_get_command (memcached.c:1274)
==2889==    by 0x405CA7: try_read_command (memcached.c:1692)
==2889==    by 0x4065BB: event_handler (memcached.c:2135)
==2889==    by 0x4E2CF98: event_base_loop (event.c:331)
==2889==    by 0x4049BE: main (memcached.c:3130)
==2889==
==2889== Invalid read of size 8
==2889==    at 0x4E2C4DF: event_queue_insert (event.c:892)
==2889==    by 0x4E3814C: epoll_dispatch (epoll.c:243)
==2889==    by 0x4E2CE60: event_base_loop (event.c:440)
==2889==    by 0x4049BE: main (memcached.c:3130)
==2889==  Address 0x39DD5CC0 is not stack'd, malloc'd or (recently) free'd
==2889==
==2889== Process terminating with default action of signal 11
(SIGSEGV): dumping core
==2889==  Access not within mapped region at address 0x39DD5CC0
==2889==    at 0x4E2C4DF: event_queue_insert (event.c:892)
==2889==    by 0x4E3814C: epoll_dispatch (epoll.c:243)
==2889==    by 0x4E2CE60: event_base_loop (event.c:440)
==2889==    by 0x4049BE: main (memcached.c:3130)
==2889==
==2889== ERROR SUMMARY: 60070 errors from 7 contexts (suppressed: 16 from 1)
==2889== malloc/free: in use at exit: 51,446,265 bytes in 10,329 blocks.
==2889== malloc/free: 19,270 allocs, 8,941 frees, 145,234,725 bytes allocated.
==2889== For counts of detected errors, rerun with: -v
==2889== searching for pointers to 10,329 not-freed blocks.
==2889== checked 50,947,848 bytes.
==2889==
==2889== LEAK SUMMARY:
==2889==    definitely lost: 49,989 bytes in 1,639 blocks.
==2889==      possibly lost: 0 bytes in 0 blocks.
==2889==    still reachable: 51,396,276 bytes in 8,690 blocks.
==2889==         suppressed: 0 bytes in 0 blocks.
==2889== Rerun with --leak-check=full to see details of leaked memory.
Segmentation fault

dmesg output:
Linux version 2.6.22-14-server (buildd at king) (gcc version 4.1.3
20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #1 SMP Tue Feb 12
03:10:53 UTC 2008 (Ubuntu 2.6.22-14.52-server)
---- snip ----
[592338.930134] memcached[16615]: segfault at 0000000000000bc1 rip
00002b6ce31060b7 rsp 00007fffc7e48900 error 6
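
The gdb backtrace above aborts inside realloc() called from
do_suffix_add_to_freelist, and valgrind reports writes in the same
function to an address that was never allocated. For comparison, here
is a minimal sketch of how a pointer freelist that grows with realloc
is commonly written so that the array pointer and its capacity never
get out of sync. This is not the memcached source, and I have not
confirmed that memcached.c deviates from it; freelist_t, freelist_add,
freelist_destroy, and the initial capacity of 4 are hypothetical names
and values chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable freelist of pointers; NOT taken from memcached.c. */
typedef struct {
    char **items;   /* array of cached pointers */
    size_t count;   /* slots currently in use */
    size_t cap;     /* slots currently allocated */
} freelist_t;

/*
 * Append s to the list, growing the array if it is full.
 * Returns 0 on success, -1 if the array could not grow
 * (in that case the list is unchanged and the caller still owns s).
 */
static int freelist_add(freelist_t *fl, char *s)
{
    assert(fl != NULL);

    if (fl->count == fl->cap) {
        size_t new_cap = fl->cap ? fl->cap * 2 : 4;

        /* Use a temporary so a failed realloc does not lose the old array. */
        char **grown = realloc(fl->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;

        fl->items = grown;
        fl->cap = new_cap;   /* update capacity only after the grow succeeded */
    }

    /* count < cap is guaranteed here, so this write stays in bounds. */
    fl->items[fl->count++] = s;
    return 0;
}

/* Release the array itself; the stored pointers are not freed. */
static void freelist_destroy(freelist_t *fl)
{
    free(fl->items);
    fl->items = NULL;
    fl->count = 0;
    fl->cap = 0;
}
```

Starting from a zero-initialized freelist_t, repeated calls to
freelist_add grow the array on demand. The write into items[] happens
only after the capacity check, which is the invariant the invalid
writes in the valgrind output suggest is being broken somewhere.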

--
Best regards,
Hugo Hallqvist
Dokad Software AB
www.dokad.se

