memcached crashing

Tim Yardley liquid@haveheart.com
Wed, 30 Jun 2004 14:40:38 -0500


Just an FYI, if you want to see which libevent method is being used by an
app...

[liquid:~/memcached-1.1.11] tyardley% setenv EVENT_SHOW_METHOD 1
[liquid:~/memcached-1.1.11] tyardley% ./memcached 
libevent using: kqueue
^C
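
For sh-compatible shells (bash, dash, zsh), where setenv is not available,
the same check could be sketched roughly as below. This is only an
illustration: show_event_method is a hypothetical helper name, not something
shipped with memcached or libevent, and it assumes the program is linked
against a libevent build that honors EVENT_SHOW_METHOD.

```shell
# Hypothetical helper (not part of memcached or libevent): run the given
# command with EVENT_SHOW_METHOD=1 set only in that command's environment,
# so the variable does not linger in the calling shell.
show_event_method() {
    env EVENT_SHOW_METHOD=1 "$@"
}

# Usage, assuming ./memcached exists in the current directory:
#   show_event_method ./memcached
```

If the binary honors the variable, it should print a "libevent using: ..."
line like the one in the transcript above; which method is named depends on
the platform and on how libevent was built.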

/tmy

-----Original Message-----
From: memcached-admin@lists.danga.com
[mailto:memcached-admin@lists.danga.com] On Behalf Of Jon Valvatne
Sent: Wednesday, June 30, 2004 12:24 AM
To: Brad Fitzpatrick
Cc: memcached@lists.danga.com
Subject: Re: memcached crashing

Ah. Sorry for the false alarm. I didn't realize I had to recompile
memcached as well, in order for it to stop using rtsig. It works very
nicely now, thank you very much for the prompt replies.

By the way: Can you say anything about the performance of poll vs epoll?
How many simultaneous connections and/or operations per second could I
get before I should start worrying about finding a way to get epoll to
work on this box?

Jon

----------------------------------------------------------------------
On Tue, 29 Jun 2004 22:06:31 -0700 (PDT)
Brad Fitzpatrick <brad@danga.com> wrote:

> Looks like you're still using buggy rtsig:
> 
> 
> rt_sigtimedwait([IO 34], {si_signo=SIGRT_2, si_code=0x1, si_pid=65, si_uid=13, si_value={int=1, ptr=0x1}}, 0xbfffdc58, 8) = 34
> rt_sigtimedwait([IO 34], {si_signo=SIGRT_2, si_code=0x1, si_pid=65, si_uid=4, si_value={int=1, ptr=0x1}}, 0xbfffdc58, 8) = 34
> fcntl64(4, F_GETFL)                     = -1 EBADF (Bad file descriptor)
> exit_group(0)                           = ?
> 
> 
> 
> On Wed, 30 Jun 2004, Jon Valvatne wrote:
> 
> > No core file. I've attached the last part of the strace output; my
> > mailer wasn't being nice with the wrapping.
> >
> > Jon
> >
> > ----------------------------------------------------------------------
> > On Tue, 29 Jun 2004 21:37:00 -0700 (PDT)
> > Brad Fitzpatrick <brad@danga.com> wrote:
> >
> > > Scary.
> > >
> > > Run it with -r to increase core file size, and make sure the user you
> > > run it as has permission to write to the directory you start it from.
> > > (with -r it won't chdir to /)
> > >
> > > Then with the core file, we can inspect it with gdb.
> > >
> > > But maybe it's not crashing and just quitting, like the event loop is
> > > ending.
> > >
> > > In that case, run it in the foreground but with strace in front of it:
> > >
> > > strace ./memcached .....
> > >
> > > Then paste what you see as its final output.
> > >
> > >
> > >
> > > On Wed, 30 Jun 2004, Jon Valvatne wrote:
> > >
> > > > Ok; thanks for the heads-up. I recompiled libevent without rtsig
> > > > support, but that doesn't seem to have changed anything at all. Still
> > > > random crashes and refused connections.
> > > >
> > > > Is there any way to get any sort of debug information out of memcached
> > > > when it crashes?
> > > >
> > > > Jon
> > > >
> > > > ----------------------------------------------------------------------
> > > > On Tue, 29 Jun 2004 21:10:53 -0700 (PDT)
> > > > Brad Fitzpatrick <brad@danga.com> wrote:
> > > >
> > > > > Do *not* use libevent's rtsig support.  I thought he removed that
> > > > > given how buggy it was.  Three really smart people worked on it for
> > > > > quite some time without getting it anywhere near reliable.  It's just
> > > > > a crap interface and it was never made to work with libevent.
> > > > >
> > > > > Use poll if you must, but epoll's really the best.
> > > > >
> > > > > - Brad
> > > > >
> > > > >
> > > > > On Wed, 30 Jun 2004, Jon Valvatne wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > > I've been using memcached to add some caching to a production
> > > > > > > system to speed things up. Everything worked smoothly on my test
> > > > > > > box, but I ran into nothing but problems when trying to go live
> > > > > > > with the changes: Memcached would just die randomly, without any
> > > > > > > error message whatsoever, within minutes of startup. And even
> > > > > > > while it was running and accepting some connections, other
> > > > > > > connections appeared to be randomly refused.
> > > > > >
> > > > > > > The only difference between the test box and the production
> > > > > > > system is that one is running Fedora Core 2, and the other Red
> > > > > > > Hat 9. Before I try to debug the situation more, I would like to
> > > > > > > ask: Does anyone here have any experience running memcached with
> > > > > > > Red Hat 9? There's obviously no epoll support, so I compiled the
> > > > > > > latest libevent with --with-rtsig, and I'm assuming that's what
> > > > > > > memcached is using. Is this just inherently buggy, or so
> > > > > > > poor-performing that my system with about a hundred connections
> > > > > > > and several operations per second will cause the problem I'm
> > > > > > > seeing?
> > > > > >
> > > > > > > One thing that worried me was the test results when compiling
> > > > > > > libevent:
> > > > > >
> > > > > > Running tests:
> > > > > > KQUEUE
> > > > > > Skipping test
> > > > > > POLL
> > > > > >  test-eof: OKAY
> > > > > >  test-weof: OKAY
> > > > > >  test-time: OKAY
> > > > > >  regress: FAILED
> > > > > > SELECT
> > > > > >  test-eof: OKAY
> > > > > >  test-weof: OKAY
> > > > > >  test-time: OKAY
> > > > > >  regress: FAILED
> > > > > > RTSIG
> > > > > >  test-eof: OKAY
> > > > > >  test-weof: OKAY
> > > > > >  test-time: OKAY
> > > > > >  regress: FAILED
> > > > > > EPOLL
> > > > > > Skipping test
> > > > > >
> > > > > > > What are these regress tests, and what would cause them to fail?
> > > > > > >
> > > > > > > By the way: Is there any way to ask memcached or libevent which
> > > > > > > polling mechanism is being used?
> > > > > >
> > > > > > Thanks in advance,
> > > > > >
> > > > > > Jon Valvatne
> > > > > >
> > > > > >
> > > >
> > > >
> >