Java client [file chunking]

Andy Lo A Foe andy.loafoe at gmail.com
Tue Oct 30 02:23:17 UTC 2007


Hi,

This is exactly why we've opted to try and fix large file support on the
client side (Ruby mogilefs-client) and figure out a combination of
tracker/storage for the MogileFS setup to support large files instead of
chunked transfers. We frequently have files of about 500MB that need to be
replicated and streamed to clients. The overhead of chunking is simply too
big and we would lose X-Sendfile support. Apache2/WebDAV has worked out
well on storage nodes (mogstored currently chokes on files > 100MB).
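
To put a rough number on that overhead, here is a minimal sketch of the
arithmetic, assuming the 64 MB chunk size mentioned further down in this
thread. The names chunk_count and chunk_ranges are made up for illustration;
they are not part of any MogileFS client API. Each chunk is stored as its own
piece, so reading the file back means at least one fetch per chunk.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Assumption: 64 MB chunks, the figure quoted in this thread for mogtool.
DEFAULT_CHUNK_SIZE = 64 * MB


def chunk_count(file_size, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return how many chunks a file of file_size bytes is split into."""
    if file_size <= 0:
        return 0
    # Integer ceiling division, so a partial last chunk still counts.
    return (file_size + chunk_size - 1) // chunk_size


def chunk_ranges(file_size, chunk_size=DEFAULT_CHUNK_SIZE):
    """Yield (offset, length) for each chunk, in file order."""
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        yield offset, length
        offset += length


if __name__ == "__main__":
    print(chunk_count(500 * MB))  # 8 chunks for one of our 500MB files
    print(chunk_count(20 * GB))   # 320 chunks for a 20 GB file
```

So a single 500MB file becomes 8 separate pieces that have to be fetched and
stitched back together in order before a client can stream it, which is the
part that rules out handing the file to X-Sendfile.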

Gr,
Andy

On 10/30/07, dormando <dormando at rydia.net> wrote:
>
> I'd suggest testing it.
>
> I've been manually disabling the chunked support, given every use case I
> have requires the files to be streamed from one place to the other. If
> you have a large, contiguous file, you need to re-assemble it in order
> at some point anyway. If you're spoonfeeding clients, you want to start
> and go without having to re-establish an HTTP session in the middle.
>
> So, if you want to pull data in chunks and can process it in parallel,
> or you want to very evenly fill every storage device, fine... but I
> don't see that happening in any useful way.
>
> It's likely my own lack of imagination here. Someone please prove me
> wrong :)
>
> It feels like a bad default at any rate, since you can't serve the large
> files back like that presently.
>
> -Dormando
>
> Lance Reed wrote:
> > I have yet to deploy a large scale setup of mogilefs but have a small
> > test env setup.
> >
> > The Java client would fit nicely into our application use.
> > Most of my experience so far has been with using mogtool, so I assumed
> > that it was best to chunk files into 64 MB pieces.
> > But I take it from this thread that this is not the case.  Do people
> > usually just put and take files as is, without chunking, then?
> > This would save time on the initial write I assume.  But do little for
> > the read back.  I have not had a chance to test, but do people see a
> > performance increase when reading back large files that have been
> > chunked due to parallel reads?
> > Or is the overhead of re-construction too much to really see a benefit?
> >
> > I'd love to hear what folks have been seeing in the real world.
> > I am planning to use mogilefs with files that range from 30 KB to 20 GB,
> > and am trying to figure out if I really need to put chunking code in.
> >
> > But I could see wanting a patch for the java client to do chunking.
> >
> > Thanks for any thoughts on this.
> >
> > Lance
> >
> >
> >
> > dormando wrote:
> >> Curious now...
> >>
> >> Does anyone use chunked files for anything?
> >>
> >> I can't think of any reason why you'd get more performance out of it,
> >> and the only benefit is being able to stuff files larger than the
> >> individual storage nodes into mogile.
> >>
> >> Believe it's still the default for mogtool with files > 64M, which
> >> must be confusing for folks?
> >>
> >> -Dormando
> >>
> >> Jared Klett wrote:
> >>>     We ended up not using the chunked file support for performance
> >>> reasons - we just store the whole file and serve it straight off disk.
> >>>
> >>>     Regardless, I'd be happy to release a patch if there's demand.
> >>>
> >>> cheers,
> >>>
> >>> - Jared
> >>>
> >>
> >>
> >
>
>