Analysis of the mogilefsd busywait bug

Brad Fitzpatrick brad at danga.com
Thu Mar 1 22:49:17 UTC 2007


Jared,

I think this is now fixed in svn751...

Index: lib/MogileFS/Connection/Worker.pm
===================================================================
--- lib/MogileFS/Connection/Worker.pm   (revision 750)
+++ lib/MogileFS/Connection/Worker.pm   (revision 751)
@@ -60,6 +60,12 @@
     }
 }

+sub event_write {
+    my $self = shift;
+    my $done = $self->write(undef);
+    $self->watch_write(0) if $done;
+}
+
 sub job {
     my MogileFS::Connection::Worker $self = shift;
     return $self->{job} unless @_;

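For anyone following along, here is a minimal sketch of the kind of spin this patch appears to address. It is an illustration under assumptions, not the mogilefsd code: it uses core IO::Select rather than Danga::Socket, and count_wakeups, $pending and $disable_when_drained are hypothetical names. Danga::Socket documents that write(undef) flushes the queued data and returns true once the queue is empty, at which point the caller should stop watching for writability. A socket with nothing queued is almost always writable, so a loop that keeps watching it wakes up on every pass.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# Count how often a select() loop wakes up for a socket that has nothing
# left to send. All names here are hypothetical, for illustration only.
sub count_wakeups {
    my ($disable_when_drained, $max_iterations) = @_;

    socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    my $writers = IO::Select->new($sock);   # roughly watch_write(1)
    my $pending = '';                       # the write queue is already empty
    my $wakeups = 0;

    for (1 .. $max_iterations) {
        last unless $writers->count;        # nothing left to watch
        my @ready = $writers->can_write(0.05);
        last unless @ready;
        $wakeups++;

        # Rough analogue of event_write: flush whatever is queued...
        if (length $pending) {
            my $n = syswrite($sock, $pending);
            substr($pending, 0, $n, '') if defined $n;
        }
        # ...and stop watching once the queue is drained (roughly watch_write(0)).
        if ($disable_when_drained && !length $pending) {
            $writers->remove($sock);
        }
    }

    close $_ for $sock, $peer;
    return $wakeups;
}

printf "still watching:  %d wakeups\n", count_wakeups(0, 1000);
printf "watch disabled:  %d wakeups\n", count_wakeups(1, 1000);
```

In the first call the loop wakes on every iteration even though there is nothing to write, which is the busy-wait pattern. In the second it wakes once and then stops watching the socket.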

Let me know?




On Wed, 28 Feb 2007, Jared Klett wrote:

> 	I just checked ProcManager.pm and David's patch is in there
> (svn748).
>
> 	I checked the versions on my dependencies and they match as
> well:
>
> Danga::Socket is up to date (1.56).
> IO::AIO is up to date (2.33).
>
> 	Nathan: how large are the files you're writing? I was just able
> to reproduce by injecting a few files 900 MB in size.
>
> cheers,
>
> - Jared
>
> -----Original Message-----
> From: mogilefs-bounces at lists.danga.com
> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Nathan Schmidt
> Sent: Wednesday, February 28, 2007 5:13 PM
> To: mogilefs at lists.danga.com
> Subject: Re: Analysis of the mogilefsd busywait bug
>
>
> After applying the patch David Weekly suggested in
> http://lists.danga.com/pipermail/mogilefs/2007-February/000762.html we've
> not seen the high CPU usage problem after about a week of 5-20 writes/sec.
>
> 2.6.19.1.pbwiki-grsec #1 SMP i686 GNU/Linux, IO::AIO v2.33,
> Danga::Socket v1.56
>
> Regards,
> -Nathan / PBwiki
>
>
>
> On Feb 28, 2007, at 1:45 PM, Brad Fitzpatrick wrote:
>
> > I'm still very eager to solve this, but I've been traveling for the
> > past 2+ weeks, away from primary dev environments.
> >
> > We need to solve this before we deploy this version of MogileFS into
> > LJ production, so don't worry -- it'll get fixed.  Hopefully this
> > week.
> >
> > - Brad
> >
> >
> > On Wed, 28 Feb 2007, Jared Klett wrote:
> >
> >> 	I hate to be the squeaky wheel, but I'm still seeing the
> >> mogilefsd taking 100% CPU aka the "busywait" bug after updating to
> >> svn748 (the latest as of today).
> >>
> >> 	I reinitialized my entire MogileFS environment, injected one
> >> file, and mogilefsd on one of the two tracker servers started
> >> spinning at 100% CPU.
> >>
> >> 	I provided some debug info in this thread:
> >>
> >> http://lists.danga.com/pipermail/mogilefs/2007-February/000792.html
> >>
> >> 	I checked lsof and strace output and it's pretty much the same
> >> story I laid out in that post.
> >>
> >> 	Is there any new info or resolution on this issue?
> >>
> >> cheers,
> >>
> >> - Jared
> >>
> >> -----Original Message-----
> >> From: mogilefs-bounces at lists.danga.com
> >> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Brad
> >> Fitzpatrick
> >> Sent: Wednesday, February 14, 2007 3:23 AM
> >> To: David Weekly
> >> Cc: nathan at pbwiki.com; mogilefs at lists.danga.com; Brett G. Durrett
> >> Subject: Re: Analysis of the mogilefsd busywait bug
> >>
> >> Committed in svn740.
> >>
> >> On Tue, 13 Feb 2007, David Weekly wrote:
>
>

