replication policy update..

Eric Lambrecht eml at
Mon Nov 13 18:41:20 UTC 2006

Brad Fitzpatrick wrote:
> Rather than modify the device table to have a "class preference" column or
> something very specific to your replication policy, let's go ahead and do
> the "host_meta", "device_meta" tables, which just look like, say:
> CREATE TABLE host_meta (
>   hostid,
>   metakey  VARCHAR(255),
>   PRIMARY KEY (hostid, metakey),
>   metaval  VARCHAR(255)
> )

Brilliant - much cleaner than what I was doing. I'll try to make the 
change today and start pushing code back to you.
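
For reference, a minimal sketch of the host_meta layout quoted above,
prototyped in Python with SQLite rather than in MogileFS's own Perl and
database layer. The column types, the NOT NULL constraints, and the
set_meta/get_meta helpers are assumptions made for illustration; only the
table and column names come from the quoted message.

```python
import sqlite3

# Hypothetical prototype: SQLite stands in for the real MogileFS database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE host_meta (
        hostid  INTEGER NOT NULL,       -- type is an assumption
        metakey VARCHAR(255) NOT NULL,
        metaval VARCHAR(255),
        PRIMARY KEY (hostid, metakey)
    )
    """
)


def set_meta(hostid, key, value):
    """Insert or overwrite one key/value pair for a host (hypothetical helper)."""
    conn.execute(
        "INSERT OR REPLACE INTO host_meta (hostid, metakey, metaval) "
        "VALUES (?, ?, ?)",
        (hostid, key, value),
    )


def get_meta(hostid, key):
    """Return the stored value, or None if the host has no such key."""
    row = conn.execute(
        "SELECT metaval FROM host_meta WHERE hostid = ? AND metakey = ?",
        (hostid, key),
    ).fetchone()
    return row[0] if row else None


set_meta(1, "preferred_class", "thumbnails")
set_meta(1, "preferred_class", "videos")  # same key, so the value is replaced
```

Because the primary key is (hostid, metakey), each host holds at most one
value per key, which is what makes the table behave like a dictionary.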


> (same with device)
> So an arbitrary key/value dictionary on each device and host.  Then we
> make accessors to the metadata from the (very new) MogileFS::Device object
> and (future) MogileFS::Host object.
> And then it's those Device and Host objects which get passed into the
> replicator policies.
> So in your policy implementation you can do stuff like:
>    if ($host->meta("preferred_class") eq "...")
> If you go that route, I'm happy to merge the common bits.  I'm also
> willing to help hack it out, but not sure what you've done.
> On Sun, 15 Oct 2006, Eric Lambrecht wrote:
>>We're currently making some mods to mogile to help us push content
>>around to different machines within our pool to deal with the fact that
>>we've got some files that are extremely small and frequently accessed
>>(thumbnails) and some other files that are extremely big and much less
>>frequently accessed (original full size videos). It's somewhat pricey to
>>get machines that can handle the gamut of stuff we store, so this is our
>>attempt to split things around to different classes of machines and
>>optimize how we store things.
>>Clearly we could separate out mogile into different clusters, but having
>>them all in the same cluster with the ability to tweak where things go
>>really simplifies our frontend when it accesses mogile.
>>Take a look at the description of how I want to tweak things and let me
>>know if anybody has complaints/suggestions/comments. The implementation
>>will fit within the new ReplicationPolicy framework and be completely
>>backwards compatible.
>>Devices are now labelled with class preferences. If you specify
>>'foo', then that device will only store files in the 'foo' class.  A
>>device can prefer to hold multiple classes or any class. If you
>>specify 'foo; bar', then this device will only hold files in either
>>the 'foo' or 'bar' classes. A device with no label, or with the label
>>'any', will hold files from any class, just as the existing system does.
>>We use this to store files that have different traffic and filesize
>>patterns on different hosts. Thumbnails are heavily hit by anonymous
>>visitors and very small, so we store them on a small set of very fast
>>(and expensive) machines with lots of RAM and not much storage in
>>them. Full length movies are very large files and are accessed
>>infrequently by a much smaller number of people who fork over their
>>credit card information, so they are stored on machines with much
>>denser (and slower, and cheaper) storage on them.
>>When determining where to put a file or a replica of a file, we first
>>try to store it on a machine that prefers the class of the file being
>>submitted. If that fails, we try to store the file on a machine that
>>will accept files of any class.
>>Additionally, we want to influence where we store the replicas of
>>files on different hosts. One reason for this is to make sure
>>that multiple copies of files are stored on machines on different
>>power strips. It does us no good if copy 1 and copy 2 are stored on
>>two hosts that are hooked up to the same dead power supply.
>>We specify where replicas of files should be stored by annotating the
>>devices with which replica they would like to store: '1', '2', '3',
>>etc. A device labelled '1' would like to store the original instance
>>of a file. A device labelled '3' would like to store the 3rd instance
>>of a file. A device labelled '1 3 5' would like to store the 1st, 3rd,
>>and 5th instance of a file. A device with no label will store any
>>instance of a file.
>>By labelling all hosts on power strip A with '1 3 5' and all hosts on
>>power strip B with '2 4 6', we can be confident that if one of the
>>power strips goes away, we'll still have access to our files.
>>We can combine the two annotation methods to even further massage our
>>content: 'foo 1 2' is a host that wants the first and second copy of
>>'foo' files, and 'foo 3 4' is a host that wants the third and fourth
>>copies of 'foo' files.
>>More examples:
>>foo 1 2         store the first and second instances
>>                of 'foo' files
>>bar             store any copy of 'bar' files
>>any 1           store the first copy of any file
>>1               store the first copy of any file
>>any             store any copy of any file
>>foo 1; bar 2    store the first copy of 'foo' files
>>                and the second copy of 'bar' files
>>1 3             store the first and third copy of any files
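
The label syntax in the examples above could be parsed along these lines.
This is a Python sketch, not MogileFS code: parse_label and its return
shape are hypothetical, and it assumes at most one class name per
semicolon-separated clause, which the proposal does not spell out.

```python
def parse_label(label):
    """Parse a device label into a list of (class_name, replicas) clauses.

    class_name is None for "any class"; replicas is an empty frozenset
    for "any replica". An empty label yields one fully permissive clause.
    """
    clauses = []
    for part in label.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty clauses such as a trailing ';'
        names = [t for t in tokens if not t.isdigit()]
        if len(names) > 1:
            # Assumption: one class per clause; the proposal does not say.
            raise ValueError(f"more than one class in clause: {part!r}")
        class_name = names[0] if names and names[0] != "any" else None
        replicas = frozenset(int(t) for t in tokens if t.isdigit())
        clauses.append((class_name, replicas))
    return clauses or [(None, frozenset())]
```

With this representation 'any 1' and '1' parse to the same clause, which
matches the table above, where both store the first copy of any file.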
>>Specifically, the algorithm works like this, when trying to determine
>>where to store replica X of a file in class CLASS:
>>1. try to store the file in a device requesting files in class CLASS,
>>replica X
>>2. try to store the file in a device requesting files in class CLASS
>>3. try to store the file in a device requesting replica X of any class
>>4. try to store the file in any device with no class or replica preferences
>>(TBD: if we're at step 1 and there is only one device that we could
>>store the file on, but it is dead or unreachable, do we fall through to
>>#2 or do we fail and try again later?)
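
The four steps above can be sketched as follows, in Python rather than as
a real ReplicationPolicy implementation. pick_devices and the device
dictionary are hypothetical. Two readings are assumed here: step 2 matches
any device that names CLASS regardless of its replica numbers, and step 4
matches only devices with no preferences at all. The open TBD question
about dead or unreachable devices is deliberately not modelled.

```python
def pick_devices(devices, file_class, replica):
    """Return (step, device_ids) for the first step with candidates.

    devices maps a device id to a list of (class_name, replicas) clauses,
    where class_name None means any class and an empty replicas set
    means any replica. Returns (None, []) when no step matches.
    """
    def matches(step, cls, reps):
        if step == 1:  # class CLASS, replica X
            return cls == file_class and replica in reps
        if step == 2:  # class CLASS (replica numbers ignored: an assumption)
            return cls == file_class
        if step == 3:  # replica X of any class
            return cls is None and replica in reps
        return cls is None and not reps  # step 4: no preferences

    for step in (1, 2, 3, 4):
        found = [
            dev_id
            for dev_id, clauses in devices.items()
            if any(matches(step, cls, reps) for cls, reps in clauses)
        ]
        if found:
            return step, found
    return None, []


devices = {
    "dev1": [("foo", frozenset({1, 2}))],  # label 'foo 1 2'
    "dev2": [("foo", frozenset({3, 4}))],  # label 'foo 3 4'
    "dev3": [(None, frozenset({1}))],      # label '1'
    "dev4": [(None, frozenset())],         # no label
}
```

For example, replica 1 of a 'foo' file is placed at step 1 on dev1, while
replica 5 of a 'foo' file falls through to step 2 and may land on either
dev1 or dev2.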
