Using MogileFS as a general replicated storage filesystem

Robin Smidsrød robin at smidsrod.no
Thu Mar 16 21:25:54 UTC 2006


I'm new to the list, so let me just try to bug you with a problem I've
been trying to find an adequate solution to for several years.

I have been collecting music and digital photographs for several years, 
and I think I have around 50000 files or so lying around on my machines. 
Organizing and indexing this many files (for one person) is a HUGE
job, and I've been looking for tools to do the job right. Once in a 
while I find a better tool to do the job, but they all work in somewhat 
the same way: They expect to see the files in a traditional filesystem 
(POSIX/VFS whatever it's actually called, I'm not 100% sure) like NFS or 
SMB. The tools do reads and writes, locking and the other common
filesystem operations that you would expect from this kind of tool.

Once in a while I tend to get new (actually old) computing hardware. It 
could really be anything, from a P133 with a 500MB harddrive to a blazing
server with a 1GHz CPU and several GB of storage. At the moment I have
approx. 1.1TB of physical storage split across 11 machines and maybe 20
harddrives. The largest harddrive is 250GB and the smallest is 160MB. 
When a defective component is encountered, it will not be replaced by a 
new drive of the same size, it will be thrown out (unless there is still 
some warranty left on it). If a machine falls apart because of a 
defective motherboard, the harddrives will be moved into another 
machine. If I notice that I'm short on space or am nearing a low replica
count I could buy a couple of new drives and mount them into my existing 
machines. When my machines are full of drives I can either toss out some 
of the small-sized harddrives or maybe buy another cheap machine that 
could handle some more drives. It's all a matter of cost and flexibility.

As you can understand, there is no way to use these odd-sized 
harddrives/machines to form any kind of standardized RAID solution, at
least not in a way that uses the entire physical storage capacity while
keeping the cost down. That was until I read about the Google File System
in some whitepaper a while ago (last year, I think). I thought to 
myself: "Wouldn't it be nice to have a general purpose filesystem that 
can store traditional files, and will replicate a file automatically if 
the replica count got low?" You would probably answer that this already
exists in the AFS (OpenAFS) filesystem, and you're probably right.
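
A minimal sketch in Python of the "replicate a file automatically if the replica count got low" idea. All names here (`find_under_replicated`, `plan_repairs`, the dictionary layout) are invented for illustration; this is not how GFS, AFS or MogileFS actually do it, only one simple way the bookkeeping could look.

```python
def find_under_replicated(placements, alive, min_replicas):
    """Return {filename: [live nodes]} for files with too few live copies.

    placements   -- dict mapping filename to the list of nodes holding a copy
    alive        -- set of node names that are currently reachable
    min_replicas -- the desired replica count (an assumed, cluster-wide setting)
    """
    result = {}
    for name, nodes in placements.items():
        live = [n for n in nodes if n in alive]
        if len(live) < min_replicas:
            result[name] = live
    return result


def plan_repairs(placements, alive, min_replicas):
    """Return a list of (filename, source_node, target_node) copy operations."""
    plan = []
    under = find_under_replicated(placements, alive, min_replicas)
    for name, live in sorted(under.items()):
        if not live:
            continue  # no surviving copy left, so there is nothing to copy from
        # Only nodes that are alive and do not already hold the file qualify.
        candidates = sorted(alive - set(live))
        missing = min_replicas - len(live)
        for target in candidates[:missing]:
            plan.append((name, live[0], target))
    return plan


# Hypothetical cluster in which node "n3" has just died.
placements = {"a.mp3": ["n1", "n2"], "b.jpg": ["n2", "n3"], "c.jpg": ["n3"]}
alive = {"n1", "n2", "n4"}
repairs = plan_repairs(placements, alive, min_replicas=2)
```

In this made-up example only "b.jpg" gets a repair copy: "a.mp3" still has two live copies, and "c.jpg" has lost its only copy, which is exactly the situation a replica count above 1 is meant to prevent.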

But there is one little problem with AFS. AFS expects you to set up
volumes and store all your files in separate volumes, and then you 
replicate the VOLUMES, not the files. That is not the way I'd like it to 
behave. What I envision is one big filesystem that automatically grows
and shrinks as disks are added to or removed from the cluster. Doing a
'df' should probably give you the combined space of all the disks in
your cluster, so that you wouldn't need to do any fancy math to determine
how much space you actually have left. If your replica count was 2 for a
specific file, you would obviously lose twice the size of the file in
space once it was stored. As far as I can tell, there is no problem in
limiting the solution so that each copy of a file has to fit in its
entirety on a single storage node. People that need to store
multi-terabyte sized single files can probably afford expensive storage
solutions as well. Splitting up a single file across several nodes would
only complicate things even more.
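
The space accounting described above can be sketched in a few lines of Python. This is only an illustration under assumed names (`cluster_free`, `space_consumed`, `place_file`) and made-up disk sizes in MB; the "most free space first" rule is one arbitrary placement policy, not something taken from MogileFS.

```python
def cluster_free(free_by_node):
    """Combined free space of all nodes: what a cluster-wide 'df' would report."""
    return sum(free_by_node.values())


def space_consumed(file_size, replicas):
    """Raw space a file uses once every replica has been stored."""
    return file_size * replicas


def place_file(free_by_node, file_size, replicas):
    """Pick `replicas` distinct nodes that can each hold the WHOLE file.

    Files are never split across nodes. Nodes with the most free space are
    preferred (ties broken by name). Updates free_by_node in place and
    returns the chosen node names, or raises ValueError if too few nodes fit.
    """
    fits = [n for n, free in free_by_node.items() if free >= file_size]
    if len(fits) < replicas:
        raise ValueError("not enough nodes with room for a whole copy")
    fits.sort(key=lambda n: (-free_by_node[n], n))
    chosen = fits[:replicas]
    for node in chosen:
        free_by_node[node] -= file_size
    return chosen


# Hypothetical odd-sized disks, free space in MB.
free = {"big": 250_000, "mid": 40_000, "tiny": 160}
before = cluster_free(free)
chosen = place_file(free, file_size=700, replicas=2)
after = cluster_free(free)
```

One consequence of the whole-file rule is that the combined 'df' figure is an upper bound: the 160MB disk counts towards the total, but can never hold a 700MB file.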

So you can understand that I got very excited when I noticed MogileFS in 
a comment on archive.org's pages about the PetaBox. It looked like 
someone had finally implemented something similar to the GoogleFS in 
Open Source. I got very curious indeed. After looking over the website 
for MogileFS I noticed that it wasn't a traditional filesystem, but it 
had a lot of similarities. I noticed that it was implemented in Perl (my 
language, perfect!) and I got even more curious. But then I got sad 
again when I noticed that you tried implementing a wrapper with FUSE, 
and you kinda gave up on it.

I would love to be involved in trying to get MogileFS to do something 
like what I have talked about above, but unfortunately my skills are a 
bit limited. I've been programming Perl for several years (actually 
programming a commercial application in Perl as a day-job), and I have 
some experience with shell-scripting, but my C/C++/Java/kernel skills 
are extremely basic. I'm pretty competent at compiling/making/installing
ready-written software in those languages, but actually writing it
myself is not where my skills lie.

What kind of obstacles stand in the way of actually implementing something
like the solution I talk about above?

Hope to hear back from you soon.

Regards,
Robin Smidsrød

PS: If you need more input on the architecture of the storage system or 
anything, please don't hesitate to ask.
PPS: Got accounts on several IM networks if someone would like to chat 
more directly. Ask for contact details if you prefer that method.


