File duplication question

Brad Fitzpatrick brad at danga.com
Mon Jan 9 21:01:17 UTC 2006


max goldberg wrote:
> One thing mentioned in quite a few of your presentations is using more 
> than one mysql server as the "master" because switching from slaves 
> takes too long 

I don't recall ever saying that as the reason.  MySQL has no master 
election.  That's the bigger problem.  Switching isn't slow so much as 
it's just not 100% reliable to have the switch be automatic and still 
have the slaves replicating correctly.

> and the cluster product being too unstable to work with, 

Nor that.  It's in-memory.  If you don't have 700 GB of memory, that 
pretty much limits your options for a 700 GB database.  For some 
applications, MySQL Cluster would be perfect.

> not to mention having multiple slaves slows things down exponentially. 

Well, kinda.  I said you get diminishing returns per-slave as you add 
more slaves to a cluster with an increasing number of writes.
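
A rough way to see the diminishing returns, as a sketch only: assume every 
slave has to replay all of the writes and the reads are split evenly 
across slaves.  The function name, the 1000 queries/sec per-server figure 
and the 20% write fraction are made-up illustration values, not numbers 
from this thread, and the master itself is ignored.

```python
def max_total_qps(num_slaves, per_server_qps, write_fraction):
    """Highest total query rate the slaves can absorb (toy model).

    Assumption: each slave applies every write, and reads are spread
    evenly, so one slave's load per unit of total traffic is
    write_fraction + (1 - write_fraction) / num_slaves.
    """
    if num_slaves < 1:
        raise ValueError("need at least one slave")
    per_slave_cost = write_fraction + (1.0 - write_fraction) / num_slaves
    return per_server_qps / per_slave_cost


if __name__ == "__main__":
    # Hypothetical numbers: 1000 queries/sec per box, 20% of traffic is writes.
    for n in (1, 2, 4, 8, 16):
        print(n, "slaves ->", round(max_total_qps(n, 1000, 0.2)), "qps")
```

In this model each added slave buys less than the previous one, and the 
total can never exceed per_server_qps / write_fraction no matter how many 
slaves you add, because the write load is not divided among them.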

> Is there anything you can point me to that explains how you do this, or 
> is it something as simple as duplicating all writing queries to each master?

Your application doesn't duplicate writes... you set up MySQL to do it 
with its async replication, or use DRBD to have a pair of machines acting 
as one highly-available MySQL.

> 
> Thanks again!
> 
> On 1/9/06, Brad Fitzpatrick <brad at danga.com> wrote:
> 
>     That's exactly what LiveJournal's products do, so we don't store
>     duplicated contents in MogileFS.
> 
> 
>     On Mon, 9 Jan 2006, Justin Azoff wrote:
> 
>      > max goldberg wrote:
>      > > Hello all,
>      > [ snip :-) ]
>      > > Currently I use an MD5 hash for the file name and a database back
>      > > end to try and keep track of all the files. One of the really
>      > > nice things about this system is that on my site file assets are
>      > > duplicated a lot. Some assets are duplicated up to a thousand
>      > > times. Using the MD5 setup, I can create `symbolic links` of
>      > > sorts in the database to keep from having to duplicate on file
>      > > space (and in turn cause more I/O to my web server).
>      > >
>      > > Is this sort of thing possible with Mogile? I haven't been able
>      > > to find much documentation on the structure of the DB, so I
>      > > can't tell if this is a standard operation or more of a "hack".
>      > >
>      > > Can anyone provide any insight?
>      >
>      > I'd say that it isn't possible using only mogilefs, but you should
>      > still be able to use it.  You could keep your existing database of
>      > filename -> md5sum mappings and then add files to mogilefs with the
>      > md5sum as the key. This way mogilefs only knows about the unique
>      > files, and your other database takes care of the filename mappings.
>      >
>      > If you use memcached to cache the file -> md5sum and md5sum ->
>      > mogstored path mappings your app should run nice and fast.
>      >
>      > --
>      > - Justin
>      >
>      >
> 
> 
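
For the dedup scheme quoted above (filename -> md5sum in your own database, 
md5sum as the storage key), here is a minimal sketch.  Everything in it is 
a stand-in: the DedupStore class and its two dicts are hypothetical, with 
`blobs` playing the role of MogileFS and `names` playing the role of the 
application's database.  No real MogileFS client calls are used.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical bytes are kept only once."""

    def __init__(self):
        self.blobs = {}   # md5 hex digest -> file bytes (stand-in for MogileFS)
        self.names = {}   # filename -> md5 hex digest (stand-in for the app DB)

    def put(self, filename, data):
        key = hashlib.md5(data).hexdigest()
        # Only store the bytes if this content has not been seen before.
        if key not in self.blobs:
            self.blobs[key] = data
        # The filename is just a pointer to the content key.
        self.names[filename] = key
        return key

    def get(self, filename):
        return self.blobs[self.names[filename]]


if __name__ == "__main__":
    store = DedupStore()
    store.put("avatar-alice.jpg", b"same picture bytes")
    store.put("avatar-bob.jpg", b"same picture bytes")
    print(len(store.names), "names,", len(store.blobs), "stored blob(s)")
```

The point of the split is that the storage layer only ever sees unique 
content, while the "symbolic links" live entirely in the name mapping.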


