dormando at rydia.net
Thu May 31 17:31:26 UTC 2007
> That's a very interesting trade off. If you've got a RAID array you can
> fail a drive and take zero impact, but then you've got the additional
> complexity of the RAID controller where failure of the entire array is a
> much worse failure case.
Saves money to just JBOD it for me... use the expensive BBU (battery-backed) controllers in the DBs.
> You still do have the possibility of failure of the whole machine. You
> can scramble and swap drives into a working box (or boxes) though. I
Have slightly more, less dense machines? Instead of keeping "cold spares"
or spare parts for my mogilefs cluster lying about, I just put them all
into production. That way, if a few machines fail over the course of a
month, I don't have to do anything at all.
> Another question... Does MogileFS duplicate entire servers to other
> entire servers or does it use an entirely random allocation pattern?
> The concern with the random allocation pattern is that if you've got
> mindevcount of 2 that statistically you will lose some data every time
> you crash two machines at the same time...
See the replication policy code. It's pluggable now, but honestly I
haven't gone through that code yet. By default it'll balance loosely
based on I/O load and free disk space, and try to make sure the replicas
are on different physical hosts. What you describe is also a problem
with RAID5 arrays anyway :) At a previous job someone had the genius
idea of building a 48-disk RAID5 array. Bad idea! That's the same as
having a mindevcount of 2 on important files over 48 machines with no
outside backup.
I/Os per second are always going to kill you before you run out of
space. Buy slightly larger disks and crank up the devcount.
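The 48-machine point can be made concrete with a little arithmetic. Under purely random placement with two replicas per file, a simultaneous crash of any two hosts destroys every file whose replica pair happens to be exactly those two hosts. A rough back-of-the-envelope sketch (the file counts are illustrative, not from any real cluster):

```python
from math import comb

def expected_files_lost(n_hosts, n_files, mindevcount=2):
    # With random placement and mindevcount=2, each file's two replicas
    # land on one of C(n_hosts, 2) possible host pairs. A crash of two
    # specific hosts destroys every file stored on exactly that pair,
    # so the expected loss is n_files divided by the number of pairs.
    pairs = comb(n_hosts, mindevcount)
    return n_files / pairs

# 10 million files over 48 hosts, two hosts die at once:
loss = expected_files_lost(48, 10_000_000)  # roughly 8865 files expected lost
```

So with enough files, *any* simultaneous double failure loses something, which is exactly why cranking devcount up beats buying bigger disks.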
> For the apps that I'm considering the small blobs would need to be
> randomly accessed with a very fast SLA so I can't pull 64MB to get at
> 10kB inside of it.
A perlbal plugin that allows seeking into chunks based on the URL would
be fine... I'm concerned, however, that you just contradicted your load
pattern. What exactly is your load pattern expected to be? :) Chunking
won't help if you have to access small files from 500 different chunks
in different places. Chunking will only help if you're mass-processing
data: data mining, text indexing, image analysis, small-file backups,
blah blah. This should be true for both MogileFS and GoogleFS.
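The seek-into-a-chunk idea boils down to translating a (chunk URL, offset, length) triple into an HTTP Range request. A minimal sketch, in Python rather than Perl for brevity; the function name and mapping are illustrative, not part of Perlbal's actual plugin API:

```python
def range_header(offset, length):
    # HTTP byte ranges are inclusive on both ends, so a read of
    # `length` bytes starting at `offset` spans offset..offset+length-1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}

# e.g. pull a 10 kB blob that starts 4 MB into a 64 MB chunk:
hdr = range_header(4 * 1024 * 1024, 10 * 1024)
# hdr == {"Range": "bytes=4194304-4204543"}
```

The hard part isn't the range math; it's maintaining the index that maps each small file to its chunk and offset, and that index lookup is what a random-access workload hammers on.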