File duplication question

Brad Fitzpatrick brad at danga.com
Mon Jan 9 19:23:43 UTC 2006


That's exactly what LiveJournal's products do, so we don't store
duplicate content in MogileFS.


On Mon, 9 Jan 2006, Justin Azoff wrote:

> max goldberg wrote:
> > Hello all,
> [ snip :-) ]
> > Currently I use an MD5 hash for the file name and a database back end to
> > try and keep track of all the files. One of the really nice things about
> > this system is that on my site file assets are duplicated a lot. Some
> > assets are duplicated up to a thousand times. Using the MD5 setup, I can
> > create `symbolic links` of sorts in the database to keep from having to
> > duplicate on file space (and in turn cause more I/O to my web server).
> >
> > Is this sort of thing possible with Mogile? I haven't been able to find
> > much documentation on the structure of the DB, so I can't tell if this
> > is a standard operation or more of a "hack".
> >
> > Can anyone provide any insight?
>
> I'd say that it isn't possible using only MogileFS, but you should still
> be able to use it. You could keep your existing database of filename
> -> md5sum mappings and then add files to MogileFS with the md5sum as the
> key. This way MogileFS only knows about the unique files, and your other
> database takes care of the filename mappings.
>
> If you use memcached to cache the file -> md5sum and md5sum -> mogstored
> path mappings, your app should run nice and fast.
>
> --
> - Justin
>
>
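
The scheme Justin describes in the quoted message (a filename -> md5sum
table kept by the application, with the md5sum used as the storage key)
can be sketched roughly as follows. This is only an illustration: the
class name DedupStore and both dictionaries are made up for the example,
and the "blobs" dictionary merely stands in for MogileFS. No real
MogileFS client calls are shown.

```python
import hashlib


class DedupStore:
    """Toy model: names map to MD5 digests, digests map to content."""

    def __init__(self):
        # Hypothetical stand-in for the application's own database
        # (filename -> md5sum).
        self.name_to_md5 = {}
        # Hypothetical stand-in for MogileFS (key = md5sum, value = content).
        self.blobs = {}

    def add_file(self, name, data):
        """Register a name; store the content only if it is new."""
        digest = hashlib.md5(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data
        # Many names may point at the same digest, like a symbolic link.
        self.name_to_md5[name] = digest
        return digest

    def read_file(self, name):
        """Resolve name -> md5sum -> content."""
        return self.blobs[self.name_to_md5[name]]


if __name__ == "__main__":
    store = DedupStore()
    store.add_file("avatar_alice.png", b"same bytes")
    store.add_file("avatar_bob.png", b"same bytes")
    # Two names, but the content is held only once.
    print(len(store.name_to_md5), len(store.blobs))
```

With this layout, a thousand names for the same asset cost a thousand
small rows in the application database but only one stored file. The two
lookups in read_file are the mappings Justin suggests caching.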


More information about the mogilefs mailing list