MogileFS efficiency and external synchronization
dormando
dormando at rydia.net
Wed Jun 28 05:35:33 UTC 2006
Yo,
We want to use MogileFS for more images, or to at least store and manage
master copies of files. I have a feeling our load pattern would not work
well with MogileFS though.
Right now we have a code repo for high traffic static images, which gets
synchronized to ramdisks on machines running lighttpd. Running them
directly out of MogileFS is tempting since we can allow the artists easy
upload/management and avoid nasty crap.
From what I've measured, I really don't want small numbers of images
hit extremely hard with the MogileFS setup. For a well-cached image the
request path is:
user -> firewall -> perlbal -> webapp -> memcached
then back up through perlbal -> mogstored -> user
Lots of roundtripping. Lots of headers/protocols. Loading 30 images on a
page will take noticeably longer. Also, for our larger images with tons
of hits, we'd have to set a mindevcount of 6+ to ensure the mogstoreds
could handle the load.
The graphics servers are just:
user -> firewall -> graphics (PF load balancing + a health check daemon
I wrote).
So what I would like to do is add a cronjob to our graphics servers that
can synchronize a domain to local memory.
If I'm understanding everything correctly, every time a key gets updated
with a new file it gets a new fid? That fid should be an
autoincrement-like value, so it will be the next highest in the series.
So I add a tracker command that allows me to dump all keys in a domain,
and a command that dumps all keys in a domain with a fid greater than N.
Then I should be able to use those to ensure what's in mogilefs is what
I have on disk?
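To make the idea concrete, here's a minimal sketch of the decision logic such a cronjob could use. It assumes the proposed tracker commands exist and return key -> fid mappings (the commands and the dump format are hypothetical, not part of MogileFS today); fetching the actual file bodies over HTTP is left out.

```python
def plan_sync(local, remote):
    """Decide what to fetch/delete, given key -> fid maps.

    `local` is what's on disk now, `remote` is the dump from the
    (hypothetical) tracker command. A key needs fetching if it is new
    or its fid changed, since a re-uploaded key gets a fresh, higher
    fid. Keys missing from the remote dump were deleted. The returned
    high-water fid lets the next run ask only for keys with fid > N.
    """
    to_fetch = sorted(k for k, fid in remote.items() if local.get(k) != fid)
    to_delete = sorted(k for k in local if k not in remote)
    high_fid = max(remote.values(), default=0)
    return to_fetch, to_delete, high_fid
```

On each cron run you'd pass the previous high-water fid to the "keys with fid greater than N" command, merge the result into the local map, and only do a full-domain dump occasionally to catch deletes.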
Those queries won't be incredibly fast, but I don't intend to use them
on domains with more than a few thousand keys, and I can add an index if
absolutely necessary.
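If the tracker's store is MySQL and keys live in a file table carrying a domain id and fid (column and table names here are an assumption about the schema), the index would just be:

```sql
-- Assumed schema: file(fid, dmid, dkey, ...). A composite index on
-- (dmid, fid) turns "keys in domain X with fid > N" into a range scan.
ALTER TABLE file ADD INDEX dmid_fid (dmid, fid);
```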
Are there any better approaches which are roughly as simple? Perhaps a
way to make the mogstored hits not as nuts?
-Dormando