MogileFS summit review

Jay Buffington jaybuffington at gmail.com
Thu Sep 21 22:30:19 UTC 2006


Hint: don't press tab then space in gmail until you're done writing the email.

Here are my notes.  Please feel free to add comments/corrections.

-lots of people (25-30) showed up
-most people are using mogile in development, but not in production
-much more documentation is needed before more people will start using this.

-Guba is making heavy use of mogile
 * they had a big problem with large (2.5 GB) files getting written
hundreds of times
    . this was because the number of bytes written was reported
incorrectly, so mogile retried
 * mogile was designed for small (< 5 MB) files.
 * some people are using chunking for this, Guba isn't.
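
A rough sketch of what chunking could look like on the client side. This
is not the MogileFS client API; the "key,index" naming scheme and the
4 MB chunk size are assumptions made up for illustration.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size, below the ~5 MB design point


def chunk_keys(key, stream, chunk_size=CHUNK_SIZE):
    """Yield (chunk_key, data) pairs read from a binary file-like object.

    The "key,index" naming is hypothetical, not a MogileFS convention.
    """
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield f"{key},{index}", data
        index += 1
```

Each pair would then be stored as its own small file, and a reader would
fetch the chunks in index order and concatenate them.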

-we talked about range requests...
 * it's possible and probably a cool feature
 * not supported yet?
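
For reference, a range request is just an HTTP GET with a Range header.
The sketch below only builds the request; the URL in the usage note is a
made-up example, and whether the storage node honors the header is exactly
the open question above.

```python
import urllib.request


def range_request(url, start, end):
    """Build a GET request for bytes start..end (inclusive) of a resource."""
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{end}")
    return req
```

For example, range_request("http://storage.example/some/file.fid", 0, 1023)
asks for the first 1024 bytes. A server that supports ranges answers with
206 Partial Content; one that does not will typically send the whole file
with a 200.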

-Jonathan gave a pretty good demo of mounting Mogile (using WDFS)
  * needs webdav support
  * This is useful for management tasks
  * about 70% done, I'd love to see this released

-we need stat calls built in
  * I thought this should be everything that stat(2) returns
     . but a lot of fields (like inode, device, etc) don't make sense
     . implement as many as make sense (like file size, last access)
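
One way to picture the reduced stat result is a small record holding only
the fields that travel well. The sketch uses a local file as a stand-in;
the MogStat name and the choice of fields are assumptions, not an agreed
design.

```python
import os
from dataclasses import dataclass


@dataclass
class MogStat:
    """Hypothetical subset of stat(2) that makes sense for a stored file."""
    size: int     # file size in bytes
    mtime: float  # last modification time (seconds since the epoch)
    atime: float  # last access time (seconds since the epoch)


def portable_stat(path):
    """Return only the portable fields, dropping inode, device, etc."""
    st = os.stat(path)
    return MogStat(size=st.st_size, mtime=st.st_mtime, atime=st.st_atime)
```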

-Mark "Junior" Smith gave a demo of how plugins work
  * I was a little unclear what the plugin he demoed did
  * if you write your own, upload it to CPAN as MogileFS::Plugin::YourPluginName
  * You can write your own plugin for replication policies
     . eg, don't replicate files to machines on the same power strip
  * more documentation on what hooks there are needs to be written
      . we really need to do a better job on the wiki
      . wikis are great, but someone needs to organize them
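
To make the power strip example concrete, here is a sketch of the selection
logic only. Real plugins are Perl modules and this does not use the actual
hook API; the (device_id, power_strip) tuples are an assumed input format.

```python
def pick_devices(candidates, copies):
    """Pick up to `copies` devices, no two on the same power strip.

    candidates: list of (device_id, power_strip) tuples in preference order.
    """
    chosen, used_strips = [], set()
    for device_id, strip in candidates:
        if len(chosen) >= copies:
            break
        if strip in used_strips:
            continue  # skip devices sharing a strip with one already chosen
        chosen.append(device_id)
        used_strips.add(strip)
    return chosen
```

If fewer distinct strips exist than copies requested, the function returns
a shorter list, and the caller has to decide whether to relax the rule.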

-Database abstraction is a good thing for mogile
  * I looked into Oracle support and it looks like I just need to
change these things:
       . REPLACE INTO is a MySQL-ism and is used in a couple of places
       . validate_dbh() needs to be generic
       . there are hints like /*!40000 SQL_CACHE */; do these work in Oracle?
       . autoincrement is used; replace with a sequence?
       . more that I probably haven't found yet
  * SQLite would be nice to lower barrier to entry
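
A sketch of one database-neutral replacement for REPLACE INTO: try an
UPDATE first and INSERT only if no row matched. It runs against SQLite
here purely for convenience, and the "settings" table is hypothetical.
Note that this is not identical to REPLACE INTO (which deletes and
re-inserts the row), and on Oracle a MERGE statement would be another
option.

```python
import sqlite3


def upsert_setting(conn, name, value):
    """Portable stand-in for REPLACE INTO on a (name, value) table."""
    cur = conn.execute(
        "UPDATE settings SET value = ? WHERE name = ?", (value, name)
    )
    if cur.rowcount == 0:
        # No existing row matched, so create one.
        conn.execute(
            "INSERT INTO settings (name, value) VALUES (?, ?)", (name, value)
        )
    conn.commit()
```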

-we discussed webdav support
 * supported with lighttpd and apache mod_dav
 * extending mogstored to support existing API and webdav?

-hardware suggestions
 * we asked that everyone send around their configuration
 * I promise to do this as soon as we buy hardware (I'm not running
it in prod yet)

-documentation
 * this would really help get more people involved
 * I'll write some POD for all the classes (I already started this)
 * Brett's How-To is an excellent start (http://durrett.net/mogilefs_setup.html)
 * I'd like to create a screencast demoing mogilefs

-tests
 * 2.0 has the beginnings of a test suite
 * you can never have too many tests.

-load testing
  * burn in a server using bonnie (http://www.textuality.com/bonnie/)
  * I suggested using JMeter (http://jakarta.apache.org/jmeter/)
      . I will write a JMeter script for this and publish it
      . really useful for benchmarking against other storage systems
  * someone also suggested push to test (http://pushtotest.com/)
  * published statistics would be nice

-monitoring
  * lots of people use these:
     . cacti       http://cacti.net/
     . ganglia   http://ganglia.sourceforge.net/
     . collectd  http://collectd.org/
  * I'd love to see a wiki article explaining how this works

-xen and vmware images
 * these would be nice to help people get started

-currently no way to change a class
-currently no job that cleans up replicated copies
  * if you decrease mindevcount, files before the change don't get reaped
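
The core of such a cleanup job might be as small as this sketch. Which
copies to drop is a policy question; trimming from the end of the list is
just an assumption made here.

```python
def excess_devices(device_ids, mindevcount):
    """Return the device ids holding copies beyond mindevcount."""
    keep = max(mindevcount, 1)  # never suggest removing the last copy
    return device_ids[keep:]
```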

-new features in 2.0
  * see Brad's slides for a list of these
  * we talked a lot about a file system check (fsck) job
      . combined with meta files this could rebuild DB

- backing up mogile
  * mogile is the backup
  * but some people (me) want it stored offsite on tape just in case
  * Brad mentioned something I forgot about how to handle this
     . something about just backing up fids larger than the largest
one the last time you backed up?
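
As I understood it, the idea amounts to keeping a high-water mark. The
sketch below assumes fids only ever increase and that files are never
modified after being written; it does nothing about deleted files.

```python
def fids_to_back_up(all_fids, last_backed_up_fid):
    """Return (fids newer than the high-water mark, the new mark)."""
    new_fids = sorted(f for f in all_fids if f > last_backed_up_fid)
    new_mark = new_fids[-1] if new_fids else last_backed_up_fid
    return new_fids, new_mark
```

The new mark would be saved after a successful backup run and passed in
as last_backed_up_fid the next time.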

- wishlist
  * automounting is another thing that would lower the barrier to entry
  * rebalance worker for when all your servers fill up and you add a
new empty one

-file descriptor limit in Red Hat is too small?
  * does there need to be a test for this?
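
A startup check along these lines could catch a low limit early. This is a
Unix-only sketch using Python's resource module; the threshold a tracker
or storage node actually needs is not something I know, so it is left as a
parameter.

```python
import resource


def fd_limit_ok(required):
    """Return True if the soft open-file limit is at least `required`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= required
```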

-Huge datasets (how scalable is Mogile)
  * I have 200 million images to put into Mogile
     . 4 "sizes" (thumbnail, screennail, orig, etc) mean almost a billion keys
  * the database would grow huge (MySQL Cluster runs out of memory)
  * I could partition the database
  * or I could write a plug in for dynamic keys
     . say the key for the thumbnail is 1234
     . the key for the screennail (which doesn't have a db entry) is 1234s
     . I like this idea the best
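
A sketch of the key derivation. Only the "s" suffix for the screennail
comes from the example above; the other suffix and the assumption that
base keys are purely numeric (so a trailing letter is unambiguous) are
made up for illustration.

```python
# Suffix per size; only "s" is from the notes, "o" is a hypothetical choice.
SIZE_SUFFIXES = {"thumbnail": "", "screennail": "s", "original": "o"}


def derived_key(base_key, size):
    """Build the key for a given size from the one key stored in the DB."""
    return f"{base_key}{SIZE_SUFFIXES[size]}"


def split_key(key):
    """Split a derived key back into (base_key, size)."""
    for size, suffix in SIZE_SUFFIXES.items():
        if suffix and key.endswith(suffix):
            return key[: -len(suffix)], size
    return key, "thumbnail"
```

With this, only the base key needs a database row, and the plugin would
map a request for 1234s onto the row for 1234.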

-meta data
  * for every file like 123456789.fid there would be a 123456789.meta
  * it would contain things like the key so fsck could rebuild
database if necessary.
  * the API would allow you to store any other data you want in this file as well
     . like a photo site could store which "size" this is (thumbnail,
original, etc)
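
A sketch of how the sidecar files and the rebuild step could fit together.
The JSON layout and the function names are assumptions; no on-disk format
was decided at the meeting.

```python
import json
import os


def write_meta(directory, fid, key, extra=None):
    """Write <fid>.meta next to <fid>.fid, recording the key and extra data."""
    path = os.path.join(directory, f"{fid}.meta")
    with open(path, "w") as fh:
        json.dump({"key": key, "extra": extra or {}}, fh)
    return path


def rebuild_index(directory):
    """Scan *.meta files and rebuild a key -> fid mapping, as fsck might."""
    index = {}
    for name in os.listdir(directory):
        if name.endswith(".meta"):
            with open(os.path.join(directory, name)) as fh:
                meta = json.load(fh)
            index[meta["key"]] = name[: -len(".meta")]
    return index
```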

The meeting lasted about four hours.  My notes sucked, so please help
add stuff I forgot.

This event got me totally motivated.  I'm working on documenting the
modules with POD so I can understand how the internals work and then
I can start hacking on it.

Thanks to Brad and Six Apart for hosting this event.  They provided
pizza and drinks, and it was much appreciated.


Jay

