mogile2.0 still doesn't like big file ?

Brad Fitzpatrick brad at danga.com
Wed Oct 4 20:41:45 UTC 2006


It's kinda fucked up that sysread fails at all, but at least at 5MB I
might understand why....

If the limit's really 8 kB, that's pathetic.  That might be where Perl
stops using its slab cache and uses malloc directly?

Are you running this w/ a memory ulimit set?

It's not fixed in subversion or CPAN yet.  Ideally I'd understand what's
going on before band-aiding it with an arbitrary (and really low!) limit,
since this seemingly doesn't affect everybody ... just some people.

Why?
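
For reference, here is a minimal sketch of the capped-read workaround quoted below. It is not the Danga::Socket code: read_capped and $MAX_CHUNK are hypothetical names, and the loop is a plain blocking read against a temp file rather than an event-driven socket read. It only shows the idea of never asking sysread for more than 8192 bytes at a time, however much the caller wants.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: 8192 is the cap from the quoted workaround, not a known Perl limit.
my $MAX_CHUNK = 8192;

# Hypothetical helper: read up to $want bytes from $fh, requesting at most
# $MAX_CHUNK bytes per sysread call. Returns the data and the number of
# sysread calls that returned data.
sub read_capped {
    my ($fh, $want) = @_;
    my $buf   = '';
    my $calls = 0;
    while (length($buf) < $want) {
        my $left = $want - length($buf);
        my $req  = $left > $MAX_CHUNK ? $MAX_CHUNK : $left;
        # Append at the current end of $buf.
        my $got = sysread($fh, $buf, $req, length($buf));
        die "sysread failed: $!" unless defined $got;
        last if $got == 0;    # EOF
        $calls++;
    }
    return ($buf, $calls);
}

# Demo: a 20000-byte temp file stands in for the socket.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} 'x' x 20000;
close $tmp_fh or die "close failed: $!";

open my $in, '<', $tmp_path or die "open failed: $!";
binmode $in;
# Ask for 5 MB, as the unpatched read size did; only 20000 bytes exist.
my ($data, $calls) = read_capped($in, 5 * 1024 * 1024);
close $in;

printf "read %d bytes in %d sysread calls\n", length($data), $calls;
```

This says nothing about why the single large sysread runs out of memory on some machines; it only shows what the band-aid does.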


On Wed, 4 Oct 2006 komtanoo.pinpimai at livetext.com wrote:

> Great, thanks. I forgot to mention that I needed to modify Danga::Socket
> (from CPAN) to limit reads from 5M to 8k; otherwise mogstored crashes
> with an "Out of Memory" runtime error.
>
>     # max 8k, or perl quits(!!)
>     my $req_bytes = $bytes > 8192 ? 8192 : $bytes;
>
> I believe this issue was addressed a long time ago, but the
> Danga::Socket in CPAN is still using 5M. I wonder if it's fixed somewhere
> in SVN and the CPAN version is just outdated?
>
> -kem
>
> On Tue, October 3, 2006 7:04 pm, Brad Fitzpatrick wrote:
> > I think I know the reason! ....
> >
> >
> > Perlbal bug I think:
> >
> >
> > The idle time isn't being reset when you're doing a PUT.  When you're
> > downloading, we reset the idle time whenever there's network activity, but
> >  during a PUT/upload, there's no call to the function which sets the last
> >  seen activity time.
> >
> > Should be an easy fix.... I've filed this at:
> >
> >
> > http://rt.livejournal.org/Ticket/Display.html?id=2869
> >
> >
> >
> >
> > On Tue, 3 Oct 2006 komtanoo.pinpimai at livetext.com wrote:
> >
> >
> >> Hi All,
> >>
> >>
> >> I think I've found the reason why mogilefs2 fails to replicate big
> >> files. It occurs mostly when the network is congested or the files are
> >> very large. I'm testing my cluster on a 10/100 hub, so it's easy to
> >> trigger the error.
> >>
> >> The problem is that mogstored kills any PUT connection taking longer
> >> than 30 seconds. I looked into Perlbal/ClientHTTPBase.pm and found:
> >>
> >>     sub max_idle_time { 30; }
> >>
> >> After I changed it to something like 5000, the problem went away; it
> >> can replicate 700M without a failure.
> >>
> >> My question is,
> >>
> >>
> >> 1. When max_idle_time is 30, why can I download a huge file taking
> >> longer than 30 secs, but I can't PUT anything taking longer than 30 secs?
> >>
> >> 2. Should the max_idle_time of a PUT operation be separated from the
> >> general max_idle_time, or just be made unlimited?
> >>
> >> thanks -kem
> >>
> >>
> >> On Mon, October 2, 2006 9:48 pm, komtanoo.pinpimai at livetext.com wrote:
> >>
> >>>> Do you have a patch which fixes this?
> >>>>
> >>>>
> >>>
> >>> I'll try to patch it tomorrow. I found a problem in the replicator of
> >>> the CVS version; I think it's the same one. I will let you know.
> >>>
> >>>> What server are you using for your storage nodes?  mogstored (which
> >>>> version?), or something else?
> >>>
> >>> It's original mogstored in revision 421, I didn't do anything fancy.
> >>>
> >>>
> >>>
> >>> -kem
> >>>
> >>>
> >>>
> >>> On Mon, October 2, 2006 7:45 pm, Brad Fitzpatrick wrote:
> >>>
> >>>
> >>>> Do you have a patch which fixes this?
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> What server are you using for your storage nodes?  mogstored (which
> >>>>  version?), or something else?
> >>>>
> >>>>
> >>>> On Mon, 2 Oct 2006 komtanoo.pinpimai at livetext.com wrote:
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> I worked on the mogile1 in CVS for a while and it doesn't like
> >>>>> big files, especially ones larger than 200M. In the current svn, 421,
> >>>>> the replicator seems to have the same problem.
> >>>>>
> >>>>> ----------------------------------------------------------------
> >>>>> [monitor(4333)] Monitor running; scanning usage files
> >>>>> [monitor(4333)] Monitor running; scanning usage files
> >>>>> [replicate(4321)] Error: wrote 720896; expected to write 1048576; failed putting to /dev1/0/000/000/0000000015.fid
> >>>>> [replicate(4321)] Failed copying fid 15 from devid 5 to devid 1 (error type: dest_error)
> >>>>> ----------------------------------------------------------------
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> And it keeps trying without success. One side effect when this
> >>>>> happens is that clients can't connect to this tracker; even "mogadm
> >>>>> check" shows REQUEST FAILURE.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> I could apply a quick-fix patch that I wrote for the CVS version
> >>>>> to it, but it would be nice if mogile2 could get rid of this
> >>>>> problem. I wonder, has nobody else had problems replicating big
> >>>>> files on mogilefs2?
> >>>>>
> >>>>> thanks -kem
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
>
>

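The idle-timeout bug described in the quoted thread (activity during a download resets the idle clock, activity during a PUT does not) can be illustrated with a small sketch. IdleTracker, note_activity and is_idle are hypothetical names for illustration, not Perlbal's classes or methods; only the 30-second limit comes from the thread. Times are passed in explicitly so the example is deterministic.

```perl
use strict;
use warnings;

package IdleTracker;    # hypothetical, not a Perlbal class

sub new {
    my ($class, %args) = @_;
    my $self = {
        max_idle  => $args{max_idle},    # seconds of silence allowed
        last_seen => $args{now},         # time of last network activity
    };
    return bless $self, $class;
}

# Call this on every read or write event, uploads included.
sub note_activity {
    my ($self, $now) = @_;
    $self->{last_seen} = $now;
    return;
}

# True if the connection has been silent for longer than max_idle.
sub is_idle {
    my ($self, $now) = @_;
    return ($now - $self->{last_seen}) > $self->{max_idle} ? 1 : 0;
}

package main;

# A PUT that delivers a body chunk every 10 seconds for a minute.
my @chunk_times = (10, 20, 30, 40, 50, 60);

# Buggy behaviour: upload activity is never recorded.
my $buggy = IdleTracker->new(max_idle => 30, now => 0);

# Fixed behaviour: each received chunk resets the idle clock.
my $fixed = IdleTracker->new(max_idle => 30, now => 0);
$fixed->note_activity($_) for @chunk_times;

printf "buggy idle at t=61: %d\n", $buggy->is_idle(61);
printf "fixed idle at t=61: %d\n", $fixed->is_idle(61);
```

With the activity recorded on each chunk, a slow but steadily progressing upload is not treated as idle, so raising max_idle_time to a large value should not be necessary.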