carl at immi.com
Thu Nov 30 20:32:14 UTC 2006
We are currently evaluating MogileFS for our application,
and I'm trying to figure out the best way to architect the system
from a hardware/software perspective. I have a few questions
regarding the overall architecture and perhaps best practices.
Consider this environment:
Potentially 3+ TB of new data monthly, stored with X replicas of
the data. (X is a number I'm still trying to work out; my
initial thought is 3 for a decent level of safety.) There will be
multiple classes of files, each with their own storage requirements.
The data is generally written once, then read some number of times
based on the compute tasks necessary to deal with the data. After a
period of time (to be determined) the data is expired and cleaned up.
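To put rough numbers on that, here is a small Python sketch of the storage arithmetic. It assumes exactly 3 TB of new data per month and 3 replicas (the figures above), and the 6-month retention period is purely a placeholder, since the real expiry time is still to be determined.

```python
# Back-of-the-envelope raw storage estimate.
# Assumptions: 3 TB/month of new data and 3 replicas (from the description),
# plus a HYPOTHETICAL 6-month retention period (the real value is undecided).

def raw_growth_tb_per_month(new_data_tb, replicas):
    """Raw disk consumed per month once every replica is counted."""
    return new_data_tb * replicas


def steady_state_tb(new_data_tb, replicas, retention_months):
    """Raw disk in use once expiry keeps pace with new writes."""
    return raw_growth_tb_per_month(new_data_tb, replicas) * retention_months


if __name__ == "__main__":
    print("raw TB added per month:", raw_growth_tb_per_month(3, 3))
    print("raw TB at steady state:", steady_state_tb(3, 3, 6))
```

With these inputs that is 9 TB of raw disk per month, so the replica count and the retention period drive the hardware budget far more than the 3 TB headline figure does.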
If I understand the documentation I've seen correctly, there is a
central MySQL DB (Clustered or not) that the trackers talk to. Then
the storage nodes just read/write based on what is given to them. Are
there limitations I should be aware of? Our current system has
roughly 200 million unique pieces of data stored as blobs in MySQL
(across multiple servers). That quantity of files won't be a problem
for MogileFS, will it? In MogileFS terminology there would be 10+
domains, with the number of classes varying from domain to
domain. So the 200 million files would be
spread out across multiple servers based on class and the rules of
replication for the class.
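As a sketch of what I mean by domains and classes, here is the layout modelled in Python. The domain names, class names, replica counts and file counts are all made up for illustration; the per-class replica count is meant to correspond to what MogileFS calls mindevcount, as far as I understand it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileClass:
    name: str
    min_replicas: int  # intended to map to the MogileFS per-class "mindevcount"
    file_count: int    # hypothetical number of files in this class


# Hypothetical layout: two of the 10+ domains, each with its own classes.
layout = {
    "sensor-data": [
        FileClass("raw", 3, 120_000_000),
        FileClass("derived", 2, 60_000_000),
    ],
    "reports": [
        FileClass("final", 3, 20_000_000),
    ],
}


def total_files(layout):
    """Number of unique files across all domains and classes."""
    return sum(c.file_count for classes in layout.values() for c in classes)


def total_copies(layout):
    """Number of stored copies once each class's replica count is applied."""
    return sum(c.min_replicas * c.file_count
               for classes in layout.values() for c in classes)


if __name__ == "__main__":
    print("unique files:", total_files(layout))
    print("stored copies:", total_copies(layout))
```

The point of the sketch is that 200 million unique files become several hundred million stored copies on the storage nodes, and each of those copies is presumably something the central database has to keep track of.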
Hardware-wise, I'm looking at some relatively generic 4U servers with
8 x 750GB SATA-2 drives in them. How would they be best utilized: as
8 separate volumes, or in a RAID 0 setup carved into some number of
1-2 TB volumes? I'm leaning towards 8 separate volumes.
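The trade-off I'm weighing can be sketched as follows. The script assumes 8 drives of 750 GB per server and that each RAID 0 stripe is built from an equal number of whole drives; because RAID 0 has no redundancy, losing one drive takes the whole stripe with it.

```python
# Compare volume layouts for one server with 8 x 750 GB drives.
# Assumption: RAID 0 stripes are built from equal numbers of whole drives.

DRIVES_PER_SERVER = 8
DRIVE_GB = 750


def volume_layout(drives_per_volume):
    """Return (number_of_volumes, size_of_each_volume_in_gb)."""
    if drives_per_volume < 1 or DRIVES_PER_SERVER % drives_per_volume != 0:
        raise ValueError("drives_per_volume must evenly divide the drive count")
    volumes = DRIVES_PER_SERVER // drives_per_volume
    volume_gb = drives_per_volume * DRIVE_GB
    return volumes, volume_gb


if __name__ == "__main__":
    for n in (1, 2, 8):
        volumes, volume_gb = volume_layout(n)
        # With RAID 0, a single failed drive loses the entire stripe,
        # so the volume size is also the amount of data to re-replicate.
        print(f"{n} drive(s) per volume: {volumes} volume(s) of {volume_gb} GB, "
              f"one drive failure loses {volume_gb} GB")
```

Total capacity is 6000 GB per server in every layout; what changes is how much data has to be re-replicated from other nodes after a single drive failure, which is why I lean towards one volume per drive.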
Any thoughts or input would be greatly appreciated.