dormando dormando at
Sat Jun 16 03:48:34 UTC 2007

MogileFS currently doesn't put smaller blobs together into "chunks". 
mogtool is able to turn large files into 64MB chunks, but not the other 
way around :)

If you store a few TB of 10 kilobyte blobs then try to read them all 
back, it'll be pretty slow.

What you can do is fetch/process in parallel. If you have 10,000 small 
files, stored with a mindevcount of at least 2, across 10 drives... You 
could run at *least* ten processes in parallel, each doing storage lookups 
and then grabbing the data from the storage nodes. (Same thing for storing.)
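
A rough sketch of that fetch loop, in Python rather than Perl, and using 
threads instead of separate processes to keep it short. `get_paths` and 
`fetch_url` are hypothetical callables you'd supply yourself: one asks a 
tracker where a key lives, the other does the HTTP GET against a storage 
node. Neither name is a real client API here, and `workers=10` is just 
the "ten drives" figure from above.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_one(key, get_paths, fetch_url):
    """Look up where a key lives, then try each replica until one works."""
    last_error = None
    for url in get_paths(key):        # hypothetical: tracker lookup
        try:
            return fetch_url(url)     # hypothetical: GET from a storage node
        except OSError as exc:
            last_error = exc          # this replica failed, try the next one
    raise LookupError(f"no readable replica for {key!r}") from last_error


def fetch_all(keys, get_paths, fetch_url, workers=10):
    """Fetch many small blobs with up to `workers` requests in flight."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blobs = pool.map(lambda k: fetch_one(k, get_paths, fetch_url), keys)
        # Consume the results before the pool shuts down.
        return dict(zip(keys, blobs))
```

With a mindevcount of 2 each key has a second replica to fall back on, 
which is what the loop in `fetch_one` relies on.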


Lamont Granquist wrote:
> Does MogileFS chunk storage like GoogleFS does?  One of the strengths of 
> GoogleFS's design is that it writes out storage in 64MB chunks on the 
> chunkserver.  If you've got thousands of little 10kB files they'll get 
> coalesced into 64MB chunk files.  When a fileserver needs to be 
> replicated the chunkfiles can be replicated which means you can stream 
> 64MB at a time.  A back of the envelope calculation of the total time to 
> stream 64MB is:
> 4 ms + 64 MB / (30 MB/s) ~= 2 seconds
> versus unchunked seeking every 10kB:
> 6553 * 4 ms + 64 MB / (30 MB/s) ~= 28 seconds
> So how does MogileFS do this?  If I fill up 2TB with 10kB blobs of data 
> under MogileFS how long does it take to get all that data off the disk? 
> Does it have to do a seek for every 10kB?
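
For reference, the back-of-the-envelope numbers in the quoted message can 
be reproduced with a few lines of Python. This is only a sketch using the 
figures given there (4 ms per seek, 30 MB/s sequential reads, 64 MB 
chunks, 10 kB blobs); the function and constant names are made up for 
illustration. The seek term alone comes to about 26 seconds, and the 
transfer time adds roughly 2 more.

```python
SEEK_S = 0.004           # 4 ms per seek, from the quoted estimate
THROUGHPUT_MB_S = 30.0   # 30 MB/s sequential read, from the quoted estimate
CHUNK_MB = 64.0          # one chunk's worth of data
BLOB_KB = 10.0           # size of each small blob


def read_time_s(total_mb, seeks):
    """Seek cost plus streaming cost, in seconds."""
    return seeks * SEEK_S + total_mb / THROUGHPUT_MB_S


blobs_per_chunk = int(CHUNK_MB * 1024 / BLOB_KB)      # 6553 blobs in 64 MB
chunked = read_time_s(CHUNK_MB, 1)                    # one seek, then stream
unchunked = read_time_s(CHUNK_MB, blobs_per_chunk)    # one seek per blob

print(f"chunked:   {chunked:.1f} s")    # chunked:   2.1 s
print(f"unchunked: {unchunked:.1f} s")  # unchunked: 28.3 s
```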

