Analysis of the mogilefsd busywait bug

Jared Klett jared at blip.tv
Fri Mar 2 01:15:07 UTC 2007


hi Brad,

	I'm happy to report that after updating to svn752, the bug no
longer rears its ugly head after running my usual battery of tests
(injection of 50 files between 400 and 900 MB in size).

	I'll kick off a test script to run overnight just to be sure,
but so far it looks great. thanks a bunch!

cheers,

- Jared

-----Original Message-----
From: Brad Fitzpatrick [mailto:brad at danga.com] 
Sent: Thursday, March 01, 2007 5:49 PM
To: Jared Klett
Cc: mogilefs at lists.danga.com
Subject: RE: Analysis of the mogilefsd busywait bug

Jared,

I think this is now fixed in svn751...

Index: lib/MogileFS/Connection/Worker.pm
===================================================================
--- lib/MogileFS/Connection/Worker.pm   (revision 750)
+++ lib/MogileFS/Connection/Worker.pm   (revision 751)
@@ -60,6 +60,12 @@
     }
 }

+sub event_write {
+    my $self = shift;
+    my $done = $self->write(undef);
+    $self->watch_write(0) if $done;
+}
+
 sub job {
     my MogileFS::Connection::Worker $self = shift;
     return $self->{job} unless @_;


Let me know?




On Wed, 28 Feb 2007, Jared Klett wrote:

> 	I just checked ProcManager.pm and David's patch is in there
> (svn748).
>
> 	I checked the versions on my dependencies and they match as
> well:
>
> Danga::Socket is up to date (1.56).
> IO::AIO is up to date (2.33).
>
> 	Nathan: how large are the files you're writing? I was just able to
> reproduce by injecting a few files 900 MB in size.
>
> cheers,
>
> - Jared
>
> -----Original Message-----
> From: mogilefs-bounces at lists.danga.com 
> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Nathan Schmidt
> Sent: Wednesday, February 28, 2007 5:13 PM
> To: mogilefs at lists.danga.com
> Subject: Re: Analysis of the mogilefsd busywait bug
>
>
> After applying the patch David Weekly suggested in
> http://lists.danga.com/pipermail/mogilefs/2007-February/000762.html we've
> not seen the high CPU usage problem after about a week of 5-20 writes/sec.
>
> 2.6.19.1.pbwiki-grsec #1 SMP i686 GNU/Linux, IO::AIO v2.33, 
> Danga::Socket v1.56
>
> Regards,
> -Nathan / PBwiki
>
>
>
> On Feb 28, 2007, at 1:45 PM, Brad Fitzpatrick wrote:
>
> > I'm still very eager to solve this, but I've been traveling for the past
> > 2+ weeks, away from my primary dev environments.
> >
> > We need to solve this before we deploy this version of MogileFS into
> > LJ production, so don't worry -- it'll get fixed.  Hopefully this
> > week.
> >
> > - Brad
> >
> >
> > On Wed, 28 Feb 2007, Jared Klett wrote:
> >
> >> 	I hate to be the squeaky wheel, but I'm still seeing mogilefsd
> >> taking 100% CPU, aka the "busywait" bug, after updating to
> >> svn748 (the latest as of today).
> >>
> >> 	I reinitialized my entire MogileFS environment, injected one file,
> >> and mogilefsd on one of the two tracker servers started spinning at
> >> 100% CPU.
> >>
> >> 	I provided some debug info in this thread:
> >>
> >> http://lists.danga.com/pipermail/mogilefs/2007-February/000792.html
> >>
> >> 	I checked lsof and strace output and it's pretty much the same story
> >> I laid out in that post.
> >>
> >> 	is there any new info or resolution on this issue?
> >>
> >> cheers,
> >>
> >> - Jared
> >>
> >> -----Original Message-----
> >> From: mogilefs-bounces at lists.danga.com 
> >> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Brad 
> >> Fitzpatrick
> >> Sent: Wednesday, February 14, 2007 3:23 AM
> >> To: David Weekly
> >> Cc: nathan at pbwiki.com; mogilefs at lists.danga.com; Brett G. Durrett
> >> Subject: Re: Analysis of the mogilefsd busywait bug
> >>
> >> Committed in svn740.
> >>
> >> On Tue, 13 Feb 2007, David Weekly wrote: