replication policy update..

Eric Lambrecht eml at guba.com
Sun Oct 15 23:30:31 UTC 2006


We're currently making some mods to mogile to help us push content 
around to different machines within our pool to deal with the fact that 
we've got some files that are extremely small and frequently accessed 
(thumbnails) and some other files that are extremely big and much less 
frequently accessed (original full size videos). It's somewhat pricey to 
get machines that can handle the gamut of stuff we store, so this is our 
attempt to split things around to different classes of machines and 
optimize how we store things.

Clearly we could separate out mogile into different clusters, but having 
them all in the same cluster with the ability to tweak where things go 
really simplifies our frontend when it accesses mogile.

Take a look at the description of how I want to tweak things and let me 
know if anybody has complaints/suggestions/comments. The implementation 
will fit within the new ReplicationPolicy framework and be completely 
backwards compatible.

--

Devices are now labelled with class preferences. If you specify
'foo', then that device will only store files in the 'foo' class.  A
device can prefer to hold multiple classes or any class. If you
specify 'foo; bar', then this device will only hold files in either
the 'foo' or 'bar' classes. A device with no label, or with the label
'any', will hold files from any class, just as the existing system
does.

We use this to store files that have different traffic and filesize
patterns on different hosts. Thumbnails are heavily hit by anonymous
visitors and very small, so we store them on a small set of very fast
(and expensive) machines with lots of RAM and not much storage in
them. Full length movies are very large files and are accessed
infrequently by a much smaller number of people who fork over their
credit card information, so they are stored on machines with much
denser (and slower, and cheaper) storage on them.

When determining where to put a file or a replica of a file, we first
try to store it on a machine that prefers the class of the file being
submitted. If that fails, we try to store the file on a machine that
will accept files of any class.
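
As a rough illustration of that two-pass lookup, here is a minimal Python
sketch. It is not mogile code: the device names, the dict layout, and the
helper names accepts_class and pick_device are all made up for the example,
and it only looks at class labels (no replica numbers).

```python
def accepts_class(label, file_class):
    """Classify a class-only device label for one file class.

    Returns 'preferred' if the label names the class, 'any' if the
    device is unlabelled or labelled 'any', and None otherwise.
    """
    classes = [c.strip() for c in label.split(";") if c.strip()]
    if not classes or "any" in classes:
        return "any"
    return "preferred" if file_class in classes else None


def pick_device(devices, file_class):
    """Pick a device name, trying class-specific devices first.

    devices is a hypothetical dict of device name -> class label.
    """
    for wanted in ("preferred", "any"):
        for name, label in sorted(devices.items()):
            if accepts_class(label, file_class) == wanted:
                return name
    return None


DEVICES = {"dev1": "foo", "dev2": "foo; bar", "dev3": "", "dev4": "any"}
print(pick_device(DEVICES, "bar"))    # a device that asked for 'bar'
print(pick_device(DEVICES, "other"))  # falls back to an 'any' device
```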

Additionally, we want to influence where we store the replicas of
files on different hosts. One reason for this is to make sure
that multiple copies of files are stored on machines on different
power strips. It does us no good if copy 1 and copy 2 are stored on
two hosts that are hooked up to the same dead power supply.

We specify where replicas of files should be stored by annotating the
devices with which replica they would like to store: '1', '2', '3',
etc. A device labelled '1' would like to store the original instance
of a file. A device labelled '3' would like to store the 3rd instance
of a file. A device labelled '1 3 5' would like to store the 1st, 3rd,
and 5th instance of a file. A device with no label will store any
instance of a file.

By labelling all hosts on power strip A with '1 3 5' and all hosts on
power strip B with '2 4 6', we can be confident that if one of the
power strips goes away, we'll still have access to our files.
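
To make the power strip argument concrete, here is a small Python sketch.
The host names, the HOSTS table, and both helper functions are hypothetical
and only model replica-only labels. It shows that copy 1 and copy 2 can
never be offered to hosts on the same strip under this labelling.

```python
def wants_replica(label, replica):
    """True if a replica-only label such as '1 3 5' accepts this copy.

    An empty label accepts every copy, as described above.
    """
    numbers = [int(tok) for tok in label.split()]
    return not numbers or replica in numbers


# Hypothetical hosts: name -> (power strip, replica label).
HOSTS = {
    "hostA1": ("A", "1 3 5"),
    "hostA2": ("A", "1 3 5"),
    "hostB1": ("B", "2 4 6"),
    "hostB2": ("B", "2 4 6"),
}


def strips_for(replica):
    """Set of power strips whose hosts ask for the given copy number."""
    return {
        strip
        for strip, label in HOSTS.values()
        if wants_replica(label, replica)
    }


print(strips_for(1), strips_for(2))
```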

We can combine the two annotation methods to even further massage our
content: 'foo 1 2' is a host that wants the first and second copy of
'foo' files, and 'foo 3 4' is a host that wants the third and fourth
copies of 'foo' files.

More examples:

foo 1 2                 store the first and second instances
                        of 'foo' files
bar                     store any copy of 'bar' files
any 1                   store the first copy of any file
1                       store the first copy of any file
any                     store any copy of any file
foo 1; bar 2            store the first copy of 'foo' files
                        and the second copy of 'bar' files
1 3                     store the first and third copies of any file
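
One possible way to read these labels, sketched in Python. The function
name parse_label and the (class, replicas) tuple layout are assumptions for
this example, not part of mogile, and it assumes class names are never
purely numeric.

```python
def parse_label(label):
    """Parse a device label into a list of (class_name, replicas) rules.

    class_name is None for 'any' or a missing class; replicas is a
    frozenset of ints, where an empty set means "any copy".
    """
    rules = []
    for part in label.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        replicas = frozenset(int(t) for t in tokens if t.isdigit())
        names = [t for t in tokens if not t.isdigit()]
        if len(names) > 1:
            raise ValueError("more than one class in %r" % part)
        name = names[0] if names and names[0] != "any" else None
        rules.append((name, replicas))
    # An unlabelled device behaves like 'any'.
    return rules or [(None, frozenset())]


for example in ("foo 1 2", "bar", "any 1", "1", "any", "foo 1; bar 2", "1 3"):
    print(repr(example), "->", parse_label(example))
```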

Specifically, the algorithm works like this, when trying to determine
where to store replica X of a file in class CLASS:

1. try to store the file in a device requesting files in class CLASS,
   replica X
2. try to store the file in a device requesting files in class CLASS
3. try to store the file in a device requesting replica X of any class
4. try to store the file in any device with no class or replica preferences
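
The four steps above could be sketched in Python roughly like this. It is
only an illustration under a few assumptions: the name choose_device and
the rule representation are made up, step 2 is read as "class CLASS with no
replica preference", and dead or unreachable devices are ignored entirely.

```python
def choose_device(devices, file_class, replica):
    """Return (step, device_name) for the first step that matches, else None.

    devices maps a name to a list of (class_name, replicas) rules, where
    class_name None means "any class" and an empty replicas set means
    "any copy". This representation is an assumption of the sketch.
    """
    steps = [
        # 1. class CLASS, replica X
        lambda c, r: c == file_class and replica in r,
        # 2. class CLASS, no replica preference (assumed reading)
        lambda c, r: c == file_class and not r,
        # 3. replica X of any class
        lambda c, r: c is None and replica in r,
        # 4. no class or replica preference
        lambda c, r: c is None and not r,
    ]
    for number, matches in enumerate(steps, start=1):
        for name in sorted(devices):
            if any(matches(c, r) for c, r in devices[name]):
                return number, name
    return None


DEVICES = {
    "d1": [("foo", {1, 2})],   # label 'foo 1 2'
    "d2": [("foo", set())],    # label 'foo'
    "d3": [(None, {1})],       # label '1'
    "d4": [(None, set())],     # no label
}
print(choose_device(DEVICES, "foo", 1))
print(choose_device(DEVICES, "bar", 2))
```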
 

(TBD: if we're at step 1 and there is only one device that we could 
store the file on, but it is dead or unreachable, do we fall through to 
#2 or do we fail and try again later?)

Eric...

