memcached 1.2.2 core dump

Chris Goffinet goffinet at yahoo-inc.com
Sat Nov 17 04:04:36 UTC 2007


Send the core dumps. You built with --enable-threads, correct?

     pthread_mutex_lock(&cache_lock);
     ret = do_item_stats_sizes(bytes);
     pthread_mutex_unlock(&cache_lock);

It's wrapped in a pthread mutex. You do know that calling this is
going to lock the entire cache, correct?

I remember reading some time ago that doing this (from a dump script)
would lock the entire cache.
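
To make the locking pattern concrete, here is a minimal sketch of a
stats walk guarded by one global mutex. It is not memcached's actual
code: toy_item, toy_link, toy_stats_sizes, cache_busy_while_locked and
NUM_BUCKETS are hypothetical names made up for illustration, and the
single list stands in for the real per-class LRU heads. The only parts
taken from the thread are the lock/unlock wrapper shape and the
round-up-to-32-bytes bucket arithmetic.

```c
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define NUM_BUCKETS 8   /* assumption: small fixed size for the sketch */

/* Hypothetical stand-in for memcached's item; ntotal plays the role
 * of ITEM_ntotal(iter). */
typedef struct toy_item {
    struct toy_item *next;
    int ntotal;
} toy_item;

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static toy_item *head = NULL;

/* Unlocked worker: the caller must already hold cache_lock. */
void do_toy_stats_sizes(int *histogram) {
    memset(histogram, 0, NUM_BUCKETS * sizeof(int));
    for (toy_item *iter = head; iter != NULL; iter = iter->next) {
        int bucket = iter->ntotal / 32;
        if ((iter->ntotal % 32) != 0) bucket++;   /* round up */
        if (bucket < NUM_BUCKETS) histogram[bucket]++;
    }
}

/* Locked wrapper: holds the global lock for the whole walk. */
void toy_stats_sizes(int *histogram) {
    pthread_mutex_lock(&cache_lock);
    do_toy_stats_sizes(histogram);
    pthread_mutex_unlock(&cache_lock);
}

/* Writers take the same lock, so they wait while a walk is running. */
void toy_link(toy_item *it) {
    pthread_mutex_lock(&cache_lock);
    it->next = head;
    head = it;
    pthread_mutex_unlock(&cache_lock);
}

/* Helper thread: records whether the lock could be taken right now. */
static void *try_lock_thread(void *arg) {
    int *rc = arg;
    *rc = pthread_mutex_trylock(&cache_lock);
    if (*rc == 0) pthread_mutex_unlock(&cache_lock);
    return NULL;
}

/* Returns 1 if another thread sees EBUSY while we hold cache_lock. */
int cache_busy_while_locked(void) {
    pthread_t t;
    int rc = 0;
    pthread_mutex_lock(&cache_lock);
    pthread_create(&t, NULL, try_lock_thread, &rc);
    pthread_join(t, NULL);
    pthread_mutex_unlock(&cache_lock);
    return rc == EBUSY;
}
```

In this sketch every reader and writer serializes on cache_lock, so a
walk over a large list stalls all other cache operations for its full
duration. It also shows why the list is only safe to walk if every
code path that relinks items takes that same lock.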

-Chris

On Nov 16, 2007, at 7:52 PM, Jeremy LaTrasse wrote:

> We're running memcached 1.2.2 with libevent 1.3 on OpenSolaris 11.
>
> We recently changed some of the ways that our application interacts  
> with memcached, and then very suddenly afterward started  
> experiencing core dumps across our 32 instances of memcached,  
> seemingly arbitrarily.
>
> We captured the core files and discovered that they were all  
> generated when a request for 'stats sizes' was issued by our  
> monitoring processes.
>
> One of the engineers here postulates the following:
>
> SEGFAULT on line 341 of items.c:
>
>     /* build the histogram */
>     memset(histogram, 0, (size_t)num_buckets * sizeof(int ));
>     for (i = 0; i < LARGEST_ID; i++) {
>         item *iter = heads[i];
>         while (iter) {
>             int ntotal = ITEM_ntotal(iter);
>             int bucket = ntotal / 32;
>             if ((ntotal % 32) != 0) bucket++;
>             if (bucket < num_buckets) histogram[bucket]++;
>             iter = iter->next;
>         }
>     }
>
> That's:
>
>             int ntotal = ITEM_ntotal(iter);
>
> Given the huge number of transactions we're doing, we're probably
> hitting a race condition around moving items from one bucket to the
> other. Perhaps a mutex lock is not being taken properly.
>
> For the time being we've disabled the 'stats sizes' request from our  
> monitoring processes to preclude this situation.
>
> I could not find this to be a known issue in previous messages on  
> this list, but I am certain that someone will end up in this scenario.
>
> I can send the core files or gdb output to anyone interested in  
> addressing this.
>
> Jeremy LaTrasse
> Operations
> Twitter
