Analysis of the mogilefsd busywait bug
Brad Fitzpatrick
brad at danga.com
Thu Mar 1 22:49:17 UTC 2007
Jared,
I think this is now fixed in svn751...
Index: lib/MogileFS/Connection/Worker.pm
===================================================================
--- lib/MogileFS/Connection/Worker.pm (revision 750)
+++ lib/MogileFS/Connection/Worker.pm (revision 751)
@@ -60,6 +60,12 @@
     }
 }

+sub event_write {
+    my $self = shift;
+    my $done = $self->write(undef);
+    $self->watch_write(0) if $done;
+}
+
 sub job {
     my MogileFS::Connection::Worker $self = shift;
     return $self->{job} unless @_;
Let me know?
On Wed, 28 Feb 2007, Jared Klett wrote:
> I just checked ProcManager.pm and David's patch is in there
> (svn748).
>
> I checked the versions on my dependencies and they match as
> well:
>
> Danga::Socket is up to date (1.56).
> IO::AIO is up to date (2.33).
>
> Nathan: how large are the files you're writing? I was just able
> to reproduce by injecting a few files 900 MB in size.
>
> cheers,
>
> - Jared
>
> -----Original Message-----
> From: mogilefs-bounces at lists.danga.com
> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Nathan Schmidt
> Sent: Wednesday, February 28, 2007 5:13 PM
> To: mogilefs at lists.danga.com
> Subject: Re: Analysis of the mogilefsd busywait bug
>
>
> After applying the patch David Weekly suggested in
> http://lists.danga.com/pipermail/mogilefs/2007-February/000762.html we've not
> seen the high CPU usage problem after about a week of 5-20 writes/sec.
>
> 2.6.19.1.pbwiki-grsec #1 SMP i686 GNU/Linux, IO::AIO v2.33,
> Danga::Socket v1.56
>
> Regards,
> -Nathan / PBwiki
>
>
>
> On Feb 28, 2007, at 1:45 PM, Brad Fitzpatrick wrote:
>
> > I'm still very eager to solve this, but I've been traveling for the
> > past 2+ weeks, away from primary dev environments.
> >
> > We need to solve this before we deploy this version of MogileFS into
> > LJ production, so don't worry -- it'll get fixed. Hopefully this
> > week.
> >
> > - Brad
> >
> >
> > On Wed, 28 Feb 2007, Jared Klett wrote:
> >
> >> I hate to be the squeaky wheel, but I'm still seeing the mogilefsd
> >> taking 100% CPU aka the "busywait" bug after updating to
> >> svn748 (the latest as of today).
> >>
> >> I reinitialized my entire MogileFS environment, injected one file,
> >> and mogilefsd on one of the two tracker servers started spinning at
> >> 100% CPU.
> >>
> >> I provided some debug info in this thread:
> >>
> >> http://lists.danga.com/pipermail/mogilefs/2007-February/000792.html
> >>
> >> I checked lsof and strace output and it's pretty much the same story
> >> I laid out in that post.
> >>
> >> Is there any new info or resolution on this issue?
> >>
> >> cheers,
> >>
> >> - Jared
> >>
> >> -----Original Message-----
> >> From: mogilefs-bounces at lists.danga.com
> >> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Brad
> >> Fitzpatrick
> >> Sent: Wednesday, February 14, 2007 3:23 AM
> >> To: David Weekly
> >> Cc: nathan at pbwiki.com; mogilefs at lists.danga.com; Brett G. Durrett
> >> Subject: Re: Analysis of the mogilefsd busywait bug
> >>
> >> Committed in svn740.
> >>
> >> On Tue, 13 Feb 2007, David Weekly wrote:
>
>