serving configuration

Mark Smith smitty at gmail.com
Mon May 5 23:26:41 UTC 2008


> Ah, yes, of course. It had become so ingrained in me not to use my app layer
> to serve static files I didn't even consider that -- I was thinking, boy,
> routing from lighttpd -> perlbal -> web app --> mogilefsd  and back again
> doesn't make any sense!

Fair enough.  Even though these files are static, the act of figuring
out where in Mogile they are isn't!  :)

> What about caching? In the excellent thread about serving best practices,
> there were a number of places for memcached to be plugged in -- obviously
> it'd be pretty easy to have the app store the key/path(s) in memcached, but
>   * I'm guessing you'd need to store all the possible paths in memcached, in
> case one of them is invalid so perlbal can go down the list?
>  * Are there other places where perlbal can link directly to memcached
> (without going through the app layer)?

There are a few really obvious places to cache, and it really depends
on your particular use case.  Briefly:


1) cache at the load balancer

If you expect fewer files that will be hit lots of times (thousands+)
then this could be a good solution.  (But then, you probably don't
want MogileFS if your usage pattern is "ten thousand files, ten
million hits each".)


2) caching in Perlbal

This actually works out pretty well if you have a sort of long tail
behavior of file access.  (I.e., lots of files rarely get loaded, some
files get loaded a lot.)  There are some Perlbal options that you can
use to turn on caching of reproxy URLs.

Caveat: this doesn't work if you have security on the resources you
want to cache!  Perlbal doesn't know anything about whether the viewer
SHOULD see it or not.  If you need that functionality, you can write a
plugin to do the caching for you, or read on.
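
To make the caveat concrete, here is a rough Python sketch (not Perlbal
code) of what a security-aware lookup has to do: check the viewer first,
on every request, and only then consult a cache keyed on the URL.  All
the names here (UrlPathCache, handle_request, is_allowed, resolve_paths)
are made up for illustration.

```python
class UrlPathCache:
    """Hypothetical cache keyed only on the URL, like a reproxy cache."""

    def __init__(self):
        self._entries = {}

    def lookup(self, url):
        return self._entries.get(url)

    def store(self, url, paths):
        self._entries[url] = paths


def handle_request(url, viewer, cache, is_allowed, resolve_paths):
    """Return (status, paths); the permission check never gets cached."""
    # The cache knows nothing about viewers, so authorization has to
    # happen before the cache is consulted, for every single request.
    if not is_allowed(viewer, url):
        return ("403", None)
    paths = cache.lookup(url)
    if paths is None:
        # Cache miss: ask the (assumed) resolver where the file lives.
        paths = resolve_paths(url)
        cache.store(url, paths)
    return ("200", paths)
```

A cache that sits in front of this check, which is effectively what
URL-keyed caching in the balancer gives you, would hand the cached paths
to anyone who asks for that URL.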


3) application level caching

LiveJournal uses this to store the lookup paths.  For example, here's the code:

    http://code.sixapart.com/trac/livejournal/browser/trunk/cgi-bin/Apache/LiveJournal.pm#L1060

Line 1057 is where the process of serving a userpic (a little image)
out of MogileFS starts.  It checks memcache first to see if we know
about the paths to this picture.  If we do, then we serve from it.  If
not, then it calls MogileFS, gets the paths, and then stuffs that
data into memcache.
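
The pattern described above looks roughly like this in Python.  This is
only a sketch, not the LiveJournal code: FakeMemcache is an in-memory
stand-in for a real memcached client, tracker_get_paths is a hypothetical
callable that asks MogileFS for the paths of a key, and the "mogp:"
prefix and one-hour TTL are arbitrary choices.  It stores the whole list
of paths, so Perlbal can be handed all of them (space-separated in
X-REPROXY-URL) and go down the list if one is bad.

```python
import time


class FakeMemcache:
    """In-memory stand-in for a memcached client (get/set with expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl=0):
        # ttl=0 means "never expire" in this stand-in.
        expires_at = time.monotonic() + ttl if ttl else None
        self._data[key] = (value, expires_at)


def lookup_paths(cache, tracker_get_paths, mogile_key, ttl=3600):
    """Check the cache first; on a miss ask MogileFS and cache the result."""
    cache_key = "mogp:" + mogile_key  # hypothetical key prefix
    paths = cache.get(cache_key)
    if paths is None:
        paths = tracker_get_paths(mogile_key)
        if paths:  # don't cache an empty answer
            cache.set(cache_key, paths, ttl)
    return paths


def reproxy_header(paths):
    """Build the header that hands every known path to Perlbal."""
    return ("X-REPROXY-URL", " ".join(paths))
```

Keeping the TTL short limits how long stale paths hang around after
MogileFS moves or re-replicates a file.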


4) MogileFS caching

This is an option as well.  I've not personally played with it much,
but you can read more about it here:

    http://code.sixapart.com/trac/mogilefs/browser/trunk/server/doc/memcache-support.txt


5) missing?

I'm probably missing something, and I haven't talked about CDNs or
caching layers like Squid, etc.  Those fit in, sure, but the cases they
solve are more specific.  For the general purpose MogileFS
installation, you likely won't use or need them.


-- 
Mark Smith / xb95
smitty at gmail.com

