Lamont Granquist lamont at
Sat Jun 16 04:06:38 UTC 2007

Yeah, I'm more interested in dealing with large-scale data management 
operations.  So when you've got 200 machines with 5TB of data per machine 
and you lose one, how long will it take to rebuild it from one of the 
replicas?  If you ever need to fsck one of the partitions, how long will 
that take?  If you need to replicate the data in one datacenter to another, 
how long will that take?  The normal day-to-day read/write operations may be 
adequately fast, but bulk operations like these can take days, weeks, or 
months when dealing with large amounts of data...
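To make "days, weeks, or months" concrete, here's a rough rebuild-time estimate for one of those 5TB machines. The numbers are illustrative assumptions, not measurements: 100 MB/s sustained streaming, 4 ms average seek, 10kB per blob.

```python
# Back-of-envelope: rebuilding 5TB from a replica, streaming vs. seek-bound.
# Assumed (illustrative) numbers: 100 MB/s sequential, 4 ms/seek, 10 kB blobs.

TB = 10**12
MB = 10**6
KB = 10**3

data = 5 * TB
stream_rate = 100 * MB      # bytes/second, sustained sequential read
seek_time = 0.004           # seconds per seek
blob_size = 10 * KB

# Pure streaming: one long sequential read.
streaming_hours = data / stream_rate / 3600

# Seek-bound: one seek per 10 kB blob, plus the same transfer time.
n_seeks = data / blob_size
seek_bound_days = (n_seeks * seek_time + data / stream_rate) / 86400

print(f"streaming: ~{streaming_hours:.0f} hours")      # on the order of half a day
print(f"seek-bound: ~{seek_bound_days:.0f} days")      # on the order of weeks
```

Under these assumptions the same 5TB takes roughly 14 hours streamed but over three weeks if every 10kB blob costs a seek, which is the gap the rest of this message is about.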

And as the size of disks increases much faster than seek times improve, it 
becomes more and more important that large amounts of data can be accessed 
in bulk streaming operations rather than by seeking across the entire 
disk...  Streaming transfer rates have been scaling better than seek times, 
which means that as time goes on it becomes more and more preferable to 
reduce the number of seeks in bulk operations...  As we go from cheap 
arrays of 12 250GB disks to cheap arrays of 12 2TB disks, this problem just 
gets worse...

On Wed, 30 May 2007, dormando wrote:
> MogileFS currently doesn't put smaller blobs together into "chunks". mogtool 
> is able to turn large files into 64MB chunks, but not the other way around :)
> If you store a few TB of 10 kilobyte blobs then try to read them all back, 
> it'll be pretty slow.
> What you can do is fetch/process in parallel. If you have 10,000 small files, 
> stored with a mindevcount of at least 2, across 10 drives... you could run at 
> *least* ten processes in parallel, each doing storage lookups and then 
> grabbing the data from the storage nodes. (Same thing for storing data.)
> -Dormando
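The parallel-fetch approach dormando describes can be sketched like this. `fetch_file()` is a hypothetical stand-in for the real client operation (tracker lookup, then an HTTP GET from one of the storage nodes the tracker returns); threads stand in for the worker processes:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_file(key):
    # Hypothetical stand-in for: ask the tracker for the paths to `key`,
    # then HTTP GET the blob from one of the returned storage nodes.
    return b"data-for-" + key.encode()

def fetch_all(keys, workers=10):
    # Fetch in parallel; with mindevcount >= 2 across 10 drives,
    # ~10 concurrent fetchers can keep every spindle busy.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs keys correctly.
        return dict(zip(keys, pool.map(fetch_file, keys)))

blobs = fetch_all([f"photo:{i}" for i in range(10_000)])
```

This parallelism hides per-request latency, but note it doesn't change the total number of seeks the disks have to do, which is the point Lamont raises below.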
> Lamont Granquist wrote:
>> Does MogileFS chunk storage like GoogleFS does?  One of the strengths of 
>> GoogleFS's design is that it writes out storage in 64MB chunks on the 
>> chunkserver.  If you've got thousands of little 10kB files they'll get 
>> coalesced into 64MB chunk files.  When a fileserver needs to be replicated 
>> the chunkfiles can be replicated which means you can stream 64MB at a 
>> time.  A back-of-the-envelope calculation of the total time to stream 64MB 
>> is:
>> 4 ms + 64 MB / 30 MB/s ~= 2 seconds
>> versus unchunked, seeking for every 10kB:
>> 6553 * 4 ms + 64 MB / 30 MB/s ~= 28 seconds
>> So how does MogileFS do this?  If I fill up 2TB with 10kB blobs of data 
>> under MogileFS how long does it take to get all that data off the disk? 
>> Does it have to do a seek for every 10kB?
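The chunked-vs-unchunked arithmetic quoted above, spelled out. This assumes the quoted figures (4 ms per seek, 30 MB/s streaming) and binary units, i.e. 64 MB = 64 * 2^20 bytes and 10 kB = 10 * 2^10 bytes, which is what makes 6553 blobs fit in one chunk:

```python
MiB = 2**20             # binary megabytes, matching the quoted "64 MB"
KiB = 2**10

chunk = 64 * MiB
blob = 10 * KiB
seek = 0.004            # 4 ms per seek
rate = 30 * MiB         # 30 MB/s streaming transfer

transfer = chunk / rate                 # ~2.13 s to stream 64 MB
chunked = seek + transfer               # one seek, then stream: ~2 s
n_seeks = chunk // blob                 # 6553 blobs per 64 MB chunk
unchunked = n_seeks * seek + transfer   # a seek per blob: ~28 s

print(f"chunked: ~{chunked:.1f} s, unchunked: ~{unchunked:.1f} s")
```

So for the same 64 MB of data, seeking per 10kB blob is roughly 13x slower than one seek plus a streaming read, and that ratio carries straight through to the multi-TB rebuild and replication cases above.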