File duplication question

max goldberg max.goldberg at gmail.com
Mon Jan 9 18:33:03 UTC 2006


Hello all,

I am currently running a site with around 150 GB of content on a single
beefy content server, which handles around 20,000 requests per minute at
peak and around 14,000 per minute on average, across around 800,000 files.
I am looking to dedicate some more hardware to the project, and as memcached
has proven invaluable to the growth of my site, I figured I would
investigate MogileFS some more.

I've spent a lot of time looking at danga.com/words/ and have found some
pretty valuable information there, but I have a few questions.

Currently I use an MD5 hash as the file name and a database back end to
keep track of all the files. One of the really nice things about this
system is that on my site file assets are duplicated a lot; some assets are
duplicated up to a thousand times. Using the MD5 setup, I can create
"symbolic links" of sorts in the database, so that I don't have to store
duplicate copies on disk (which would in turn cause more I/O on my web
server).
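
For illustration, a minimal sketch of this kind of MD5-keyed layout might
look like the following. It assumes the hash is taken over the file
contents. The table names, the add_asset and read_asset helpers, and the
in-memory store dict standing in for the disk are all made up for the
example and are not the actual setup described above.

```python
import hashlib
import sqlite3


def open_catalog(path=":memory:"):
    """Open the (hypothetical) catalog database and create its tables."""
    db = sqlite3.connect(path)
    # One row per unique piece of content, keyed by its MD5 digest.
    db.execute(
        "CREATE TABLE IF NOT EXISTS blobs ("
        "md5 TEXT PRIMARY KEY, size INTEGER NOT NULL)"
    )
    # One row per logical asset; many assets may point at the same blob.
    # These rows play the role of the "symbolic links".
    db.execute(
        "CREATE TABLE IF NOT EXISTS assets ("
        "name TEXT PRIMARY KEY, md5 TEXT NOT NULL REFERENCES blobs(md5))"
    )
    return db


def add_asset(db, name, data, store):
    """Register an asset; write the bytes only if the content is new.

    `store` is a dict standing in for the file system (an assumption
    made to keep the example self-contained).
    Returns (digest, is_new).
    """
    digest = hashlib.md5(data).hexdigest()
    row = db.execute("SELECT 1 FROM blobs WHERE md5 = ?", (digest,)).fetchone()
    is_new = row is None
    if is_new:
        store[digest] = data  # the only place content is physically stored
        db.execute(
            "INSERT INTO blobs (md5, size) VALUES (?, ?)", (digest, len(data))
        )
    # Duplicates cost one small database row, not another copy on disk.
    db.execute(
        "INSERT OR REPLACE INTO assets (name, md5) VALUES (?, ?)",
        (name, digest),
    )
    db.commit()
    return digest, is_new


def read_asset(db, name, store):
    """Resolve an asset name to its digest, then fetch the shared content."""
    row = db.execute(
        "SELECT md5 FROM assets WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return store[row[0]]
```

With this shape, adding the same bytes under a thousand different names
leaves one stored copy and a thousand rows in the assets table.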

Is this sort of thing possible with MogileFS? I haven't been able to find
much documentation on the structure of the DB, so I can't tell whether this
would be a standard operation or more of a "hack".

Can anyone provide any insight?