MogileFS Scalability

Brandon Ooi brandon at hotornot.com
Wed Aug 24 23:31:44 PDT 2005


Hi,

I have a couple of questions regarding MogileFS scalability. While it's 
easy to expand the size of the filesystem, it's not so straightforward 
to increase bandwidth.

It seems the first thing to become a bottleneck would be the 
database. One option is MySQL Cluster. While it's a great 
technology, I just don't know how mature it is. What are other people's 
experiences using MySQL Cluster?

The other option would be a cluster of replicated databases using 
the InnoDB engine. However, it seems all trackers (mogilefsd 
instances) would have to point to the master database, because any of 
them may need to do writes. So even in an environment where most 
requests are reads, replication doesn't really buy you anything aside 
from a hot backup, and you're still stuck on the master db.

I guess one solution would be to build an application-level distinction 
between reads and writes and send them to different trackers: reads 
could go to any tracker, while writes would go only to trackers 
pointing at the master db.
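
To make the idea concrete, here is a rough sketch of what I mean, in 
Python. Everything in it is hypothetical: the TrackerRouter class, the 
host names, and the way the pools are split are my own illustration, 
not part of any MogileFS client API. It only picks which tracker 
address to use; the actual tracker request would still go through the 
normal client library.

```python
import itertools


class TrackerRouter:
    """Pick a tracker address based on whether the request reads or writes.

    Hypothetical helper: read_trackers may include every tracker, while
    write_trackers should list only trackers backed by the master db.
    """

    def __init__(self, read_trackers, write_trackers):
        if not read_trackers or not write_trackers:
            raise ValueError("both pools need at least one tracker")
        # Round-robin over each pool so load is spread evenly.
        self._read_cycle = itertools.cycle(list(read_trackers))
        self._write_cycle = itertools.cycle(list(write_trackers))

    def tracker_for(self, operation):
        """Return the next tracker address for 'read' or 'write'."""
        if operation == "read":
            return next(self._read_cycle)
        if operation == "write":
            return next(self._write_cycle)
        raise ValueError("unknown operation: %r" % (operation,))


# Hypothetical addresses: tracker1 talks to the master db, the other two
# talk to replicated slaves.
router = TrackerRouter(
    read_trackers=["tracker1:7001", "tracker2:7001", "tracker3:7001"],
    write_trackers=["tracker1:7001"],
)
```

One caveat with this approach: replication is asynchronous, so a read 
sent to a slave-backed tracker right after a write may not see the new 
file yet.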

Assuming the database is not the bottleneck, there may be a large 
difference in access patterns between rarely accessed files and really 
popular files. Does increasing the min_dev_count for popular files also 
increase potential bandwidth? If so, what are your experiences with this?
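
For reference, this is the back-of-the-envelope model behind that 
question. It assumes reads of a file are spread evenly across all of 
its replicas and that each storage device can serve a fixed amount of 
traffic; I don't know whether either assumption holds in practice. The 
function name and the 100 Mbit/s figure are made up for illustration.

```python
def read_bandwidth_ceiling(replica_count, per_device_mbps):
    """Upper bound on read bandwidth for one file, in Mbit/s.

    Assumes reads are spread evenly over replica_count devices and each
    device can serve per_device_mbps. Both are simplifying assumptions.
    """
    if replica_count < 1:
        raise ValueError("a file needs at least one replica")
    return replica_count * per_device_mbps


# Hypothetical numbers: how the ceiling would grow with the replica count
# if each device could serve 100 Mbit/s.
for replicas in (2, 3, 5):
    print(replicas, read_bandwidth_ceiling(replicas, 100))
```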

Brandon

