Using MogileFS as a general replicated storage filesystem
robin at smidsrod.no
Thu Mar 16 21:25:54 UTC 2006
I'm new to the list, so let me just try to bug you with a problem I've
been trying to find an adequate solution to for several years.
I have been collecting music and digital photographs for several years,
and I think I have around 50000 files or so lying around on my machines.
Organizing and indexing this many files (for one person) is a HUGE
job, and I've been looking for tools to do the job right. Once in a
while I find a better tool to do the job, but they all work in somewhat
the same way: They expect to see the files in a traditional filesystem
(POSIX/VFS whatever it's actually called, I'm not 100% sure) like NFS or
SMB. The tools do reads and writes, locking and the other common
filesystem operations that you expect from these kinds of tools.
Once in a while I tend to get new (actually old) computing hardware. It
could really be anything, from a P133 with a 500MB harddrive to a blazing
server with a 1GHz CPU and several GB of storage. At the moment I have
approx 1.1TB of physical storage split across 11 machines and maybe 20
harddrives. The largest harddrive is 250GB and the smallest is 160MB.
When a defective component is encountered, it will not be replaced by a
new drive of the same size, it will be thrown out (unless there is still
some warranty left on it). If a machine falls apart because of a
defective motherboard, the harddrives will be moved into another
machine. If I notice that I'm short on space or am nearing a low replica
count I could buy a couple of new drives and mount them into my existing
machines. When my machines are full of drives I can either toss out some
of the small-sized harddrives or maybe buy another cheap machine that
could handle some more drives. It's all a matter of cost and flexibility.
As you can understand, there is no way to use these odd-sized
harddrives/machines to form any kind of standardized RAID solution. At
least not in a way that utilizes the entire physical storage capacity and
keeps the cost down. That is, until I read about the Google FileSystem
in some whitepaper a while ago (last year, I think). I thought to
myself: "Wouldn't it be nice to have a general purpose filesystem that
can store traditional files, and will replicate a file automatically if
the replica count got low?" You would probably answer that this already
exists in the AFS (OpenAFS) filesystem, and you're probably right.
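As a rough illustration of the "replicate a file automatically if the replica count got low" idea, here is a minimal Python sketch. It is not how GoogleFS, AFS or MogileFS actually do it; the function name `plan_rereplication`, the dictionaries and the "pick the device with the most free space" rule are all assumptions made up for this example.

```python
# Hypothetical model: every name here is invented for illustration only.

def plan_rereplication(replicas, free_space, sizes, min_copies=2):
    """Return a list of (file, target_device) copies needed to reach min_copies.

    replicas:   dict mapping file name -> set of live devices holding a copy
    free_space: dict mapping live device -> free space (any consistent unit)
    sizes:      dict mapping file name -> file size (same unit)
    """
    free = dict(free_space)  # work on a copy so the caller's data is untouched
    plan = []
    for name in sorted(replicas):
        holders = set(replicas[name])
        if not holders:
            continue  # no surviving copy left to replicate from
        while len(holders) < min_copies:
            # Candidates: devices without a copy that can hold the whole file.
            candidates = [d for d in free
                          if d not in holders and free[d] >= sizes[name]]
            if not candidates:
                break  # nowhere to put another copy; file stays under-replicated
            # Assumed policy: prefer the device with the most free space.
            target = max(candidates, key=lambda d: (free[d], d))
            free[target] -= sizes[name]
            holders.add(target)
            plan.append((name, target))
    return plan


# Example: disk holding the second copy of song.mp3 has died.
replicas = {"song.mp3": {"disk_a"}, "photo.jpg": {"disk_a", "disk_b"}}
free_space = {"disk_a": 100, "disk_b": 50, "disk_c": 500}
sizes = {"song.mp3": 40, "photo.jpg": 10}
print(plan_rereplication(replicas, free_space, sizes))
# prints [('song.mp3', 'disk_c')]
```

The point of the sketch is that replication is decided per file, not per volume: only song.mp3 needs a new copy, and it goes wherever there happens to be room.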
But there is one little problem with AFS. AFS expects you to set up
volumes and store all your files in separate volumes, and then you
replicate the VOLUMES, not the files. That is not the way I'd like it to
behave. What I envision is one big filesystem that automatically grows
and shrinks as disks are added to or removed from the cluster. Doing a
'df' should probably give you the combined space of all the disks in
your cluster, so that you wouldn't need to do any fancy math to
determine how much space you actually have left. If
your replica count was 2 for a specific file, it would obviously take up
twice its own size once it was stored. As far as I can tell, there is no
problem in limiting the solution so that each copy of a file is stored
whole on a single storage node. People that need to store
multi-terabyte single files can probably afford expensive storage
solutions as well. Splitting a single file across several nodes would
only complicate things even more.
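The 'df' arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `cluster_df` and `can_store` are hypothetical, the disk list is made up (only the 250GB and 160MB figures come from this post), and all sizes are in MB.

```python
# Hypothetical helpers, not part of MogileFS or any real tool.

def cluster_df(disk_sizes, files):
    """Combined 'df' view of the cluster.

    disk_sizes: dict mapping device -> capacity
    files:      list of (size, replica_count) tuples
    """
    total = sum(disk_sizes.values())
    # A file with N replicas consumes N times its own size.
    used = sum(size * copies for size, copies in files)
    return {"total": total, "used": used, "free": total - used}


def can_store(size, copies, free_per_disk):
    """True if `copies` distinct disks each have room for the WHOLE file.

    Because a file is never split across nodes, combined free space
    alone does not tell you whether a large file will fit.
    """
    roomy = [d for d, free in free_per_disk.items() if free >= size]
    return len(roomy) >= copies


disks = {"big": 250_000, "mid": 80_000, "tiny": 160}  # sizes in MB
print(cluster_df(disks, [(700, 2), (5, 2)]))
# prints {'total': 330160, 'used': 1410, 'free': 328750}
print(can_store(100_000, 2, disks))
# prints False: only one disk can hold a 100 GB file whole
```

The second function shows the price of keeping every file whole on one node: the combined free space can exceed what is needed while the file still cannot be placed with the requested replica count.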
So you can understand that I got very excited when I noticed MogileFS in
a comment on archive.org's pages about the PetaBox. It looked like
someone had finally implemented something similar to the GoogleFS in
Open Source. I got very curious indeed. After looking over the website
for MogileFS I noticed that it wasn't a traditional filesystem, but it
had a lot of similarities. I noticed that it was implemented in Perl (my
language, perfect!) and I got even more curious. But then I got sad
again when I noticed that you tried implementing a wrapper with FUSE,
and you kinda gave up on it.
I would love to be involved in trying to get MogileFS to do something
like what I have talked about above, but unfortunately my skills are a
bit limited. I've been programming Perl for several years (actually
programming a commercial application in Perl as a day-job), and I have
some experience with shell-scripting, but my C/C++/Java/kernel skills
are extremely basic. I'm pretty competent at compiling/making/installing
ready-written software in those languages, but actually writing software
in them myself is not where my skills lie.
What kind of obstacles stand in the way of actually implementing
something like the solution I describe above?
Hope to hear back from you soon.
PS: If you need more input on the architecture of the storage system or
anything, please don't hesitate to ask.
PPS: Got accounts on several IM networks if someone would like to chat
more directly. Ask for contact details if you prefer that method.