Very large numbers of files?

dormando dormando at rydia.net
Wed Sep 19 03:34:38 UTC 2007


Andrew Cantino wrote:
> I'm interested in using mogilefs to store very large numbers of files
> (tens to hundreds of millions).  Will this be a problem? 

Depends on how often you add, delete, re-replicate, read, etc. mogilefs'
choke point is currently the database. That said, a nice dual-CPU
quad-core machine with 32G+ of RAM will likely hold hundreds of millions
of files okay.
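
For a rough feel of whether the metadata fits in RAM, here is a
back-of-envelope sketch in Python. It assumes the database keeps about
one row per file plus one row per stored copy. The function name and
every input (file count, copies per file, bytes per row) are
placeholders for illustration, not measured mogilefs numbers; check real
row and index sizes against your own database.

```python
GIB = 1024 ** 3


def estimate_metadata_bytes(num_files, copies_per_file,
                            bytes_per_file_row, bytes_per_copy_row):
    """Rough metadata size: one file row plus one row per stored copy.

    The row sizes are caller-supplied guesses and should include index
    overhead if you want the result to reflect it.
    """
    per_file = bytes_per_file_row + copies_per_file * bytes_per_copy_row
    return num_files * per_file


# Placeholder inputs only: 200 million files, 2 copies each,
# 100 bytes per file row and 30 bytes per copy row (all assumed).
total = estimate_metadata_bytes(200_000_000, 2, 100, 30)
print(f"approx. metadata size: {total / GIB:.1f} GiB")
```

If the estimate lands well under the RAM on the database machine, the
hot part of the tables can stay cached and lookups stay cheap.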

It also wouldn't be too huge of an investment to add database 
partitioning support.
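
To make the partitioning idea concrete, here is a minimal Python sketch
of one common approach: hash each file's domain and key to pick a
database. This is not something mogilefs does today, and the host names,
`partition_for`, and `database_for` are all hypothetical.

```python
import hashlib

# Hypothetical database hosts, one per partition.
PARTITIONS = [
    "db0.example.internal",
    "db1.example.internal",
    "db2.example.internal",
    "db3.example.internal",
]


def partition_for(key, num_partitions):
    """Map a string key to a partition index in [0, num_partitions)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Use a stable hash so every tracker picks the same partition.
    # Python's built-in hash() is randomized per process, so avoid it.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def database_for(domain, key):
    """Return the (hypothetical) database host holding this file's rows."""
    index = partition_for(f"{domain}:{key}", len(PARTITIONS))
    return PARTITIONS[index]


print(database_for("photos", "user/42/avatar.jpg"))
```

Plain modulo hashing is the simplest scheme, but changing the number of
partitions moves most keys to a different database. Consistent hashing
or a lookup table would cut down on that reshuffling.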

I guess it's worth noting that the more actively your dataset is
changing, the more load the system presents to itself in general. If you
load up a hundred million files and then mostly read them, most of
mogilefs will be pretty bored. Doing more than that adds some extra DB
load.

> Could anyone
> point me to benchmark documentation about how mogilefs scales as the
> number of stored files grows?

Wish there were some :) There's the one database; the rest depends on
how many spindles you add to the cluster. Disks slow? Add more disks,
rebalance, or drain overloaded devices, and move on. Trackers overloaded?
Add more trackers. Database slow? Possibly a bigger issue.

-Dormando

