Announce: mogilefs-server 2.15, aka "fsck, rebalance, drain"

Brad Fitzpatrick brad at
Mon May 7 19:58:42 UTC 2007


I'm pleased to announce an exciting new MogileFS release.  The Changelog,
while exciting, is long, so I'll summarize:

   * much, much faster fsck (if using new mogstored, else
     same speed as before)

   * rebalance support (replicators, when idle, can optionally rebalance
     the state of the world).  to use it, with the new mogadm command:

        mogadm settings set enable_rebalance 1

     see if it's still running with:

        mogadm settings list

     (it turns itself off when done)

   * new device state "drain", like mix of dead & readonly.  still serve
     traffic, but don't get new files.  replicators, when idle, migrate
     files off those disks.

   * mogstored can use lighttpd instead of Perlbal now, including
     auto-configing lighttpd, and managing it as a child process.
     passes tests at least, but could use field reports from you
     guys using it in production.  (this previously worked with trickery,
     but now it works out-of-the-box...)
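
To make the "drain" state a bit more concrete, here is a rough Python sketch of how the states compare when expressed as traits. It is only an illustration and not the real Perl code: the trait names (serves_reads, gets_new_files, drains_files_off) are made up for this example. The rows for "alive", "readonly" and "dead" are assumptions inferred from the description of "drain" as a mix of dead and readonly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceState:
    """Traits of a device state (hypothetical names, for illustration only)."""
    name: str
    serves_reads: bool      # can clients still read files from the device?
    gets_new_files: bool    # may new files be placed on the device?
    drains_files_off: bool  # do idle replicators migrate files off it?


# One shared object per state name.  Only the "drain" row is taken directly
# from the announcement; the other rows are assumptions.
_STATES = {
    s.name: s
    for s in (
        DeviceState("alive", True, True, False),
        DeviceState("readonly", True, False, False),
        DeviceState("drain", True, False, True),
        DeviceState("dead", False, False, True),
    )
}


def device_state(name):
    """Return the shared DeviceState for `name`, or None if it is unknown."""
    return _STATES.get(name)
```

With a table like this, calling code asks questions such as device_state("drain").gets_new_files instead of comparing state strings.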

And internals-wise:

   * tons of code cleanup, new objects, new tests, lots of new comments
   * bug fixes (especially corner cases with MultipleHosts
     replication policy)
   * DeviceState abstraction
   * ReplicationRequest abstraction
   * RebalancePolicy abstraction (complementing ReplicationPolicy)

I think you'll all really enjoy this release.  Please report any
questionable behavior, or just ask questions in general.

Unfortunately (or fortunately), this release has some new dependencies,
which made our lives much easier and the code much cleaner when adding
new stuff to mogstored... Gearman server/clients are now required, as
mogstored now has an embedded gearman server in it, for child processes
doing work.  This will also be used in mogilefsd soon, for upcoming
pure-HTTP support, so we didn't feel bad adding the dependency already.

If you want, you can upgrade only the mogilefs-server and keep using the
old mogstoreds; then you don't need the Gearman stuff yet.  (you'll need
it in future upgrades, though)  But note that you then won't have fast fsck.


new mogadm (in MogileFS-Utils package)

new mogilefs-server:
   * Danga::Socket >= 1.56, and Perlbal >= 1.53.

If you want mogstored to use less CPU, you'll also want Perlbal::XS::HTTPHeaders:

Full changelog:

2007-05-07: Release version 2.15 ("fsck/drain/rebalance")

        * minor bug fixes and enhancements for MultipleHosts replication
          policy.  bunch of corner cases now checked with new tests.
          now easy to write more replication policy tests in future.

        * replication policies can now return "desperate" requests,
          signalling that a replication reassessment should be
          enqueued for the future, to see if things could be improved.
          (that part's not currently implemented, but the real feature
           and motivation is that the rebalancer now won't delete a
           DevFID if doing so results in a desperate move, only if it
           results in an ideal move.)

        * replication policies now can optionally return a new return
          value of the (new) type MogileFS::ReplicationRequest, which
          has pretty accessor names, can suggest multiple places, and
          can indicate non-ideal emergency replication decisions.
          old plugins' return values will be transparently upgraded
          to the equivalent new return value objects.

        * adding new device state: "drain".  it's a hybrid of "dead"
          (in that files are migrated off it) and "readonly", in that
          it still serves traffic... it just doesn't get new files.
          this also introduces the new object-oriented DeviceState class,
          and device_state($name) utility function to get the DeviceState
          singleton by name

        * internal code cleanup.  notably, kill the old & nasty legacy
          'find_deviceid' function which was ridiculously long and hairy.
          the two callers are now more readable with sorts/greps/etc.

        * make mogstored's devN/usage writing process (DiskUsage) be less
          racy with the mogilefsd monitoring code... don't open file for
          write... open read/write, then in one write system call, write
          the entire file, with newline padding at end to cover old data,
          then truncate it if necessary.  should remove harmless (but scary)
          error messages previously reported by the mogilefsd monitor
          about zero-length usage files.

        * new protocol commands to list/set (certain) server settings,
          with value sanity checking (see MogileFS::Config for which
          are settable, and with what values).  needed for "enable_rebalance".
          was partially enabled before for slave settings.  also needed
          for memcached support before, which was never possible to
          set with mogadm, only with db tweaking.

        * make mogilefsd fsck use new mogstored fid_sizes command, to
          do bulk stats.  speeds up fscks a ton.

        * be robust against system clocks that go backwards between
          gettimeofday calls:

        * Put gearman server in mogstored process, add worker
          'mogstored-fidsizes' which runs as subprocess of mogstored. Add
          side-channel command 'fid_sizes' which allows us to quickly enumerate
          and get sizes for files across entire devices on a storage node.

        * remove all code like $state eq "readonly", $state=~ /^dead|down$/
          and instead convert it into specific questions on policy/traits
          of given state, like $dev->should_put_new_files_on, or
          $dev->should_drain_files_off.  see MogileFS::DeviceState,
          objects of which are accessed via $dev->dstate, or new
          MogileFS::Util device_state($name) wrapper.

        * start of rebalance support.  (where replication workers, in their
          idle time, can rearrange files to even out disk space and/or IO
          activity on storage nodes... policy isn't hard-coded, and is
          in fact currently random)

        * lighttpd support in both mogilefsd and mogstored.  passes test
          suite with environment MOGSTORED_SERVER_TYPE=lighttpd set now.

        * abstract out the HTTP server support in mogstored, so
          mogstored isn't just a perlbal wrapper, but an anything
          wrapper. (in particular, lighttpd and apache)  mogstored still
          exists for all its other misc admin/monitoring functions,
          but can then manage/configure apache/lighttpd child process(es).
          so far they're just stubbed out.

        * split mogstored into separate files per class, rather than one
          large script.
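
The less-racy usage-file write described in the changelog (open read/write, write everything in one system call with newline padding, then truncate) can be sketched roughly as follows. This is a Python illustration of the technique only, not the actual DiskUsage code. The function name and the file contents in the usage example are made up.

```python
import os


def write_usage_file(path, content):
    """Rewrite `path` in place so a concurrent reader never sees it empty.

    Hypothetical helper illustrating the technique; not mogstored's code.
    """
    data = content.encode("utf-8")
    # Open read/write WITHOUT O_TRUNC, so the old contents stay in place
    # until the new bytes overwrite them.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        old_size = os.fstat(fd).st_size
        # Pad with newlines so that one write covers all of the old data.
        padding = b"\n" * max(0, old_size - len(data))
        # Single write call from offset 0 (assumes a small regular file,
        # where the kernel writes the whole buffer at once).
        os.write(fd, data + padding)
        # Drop the padding, if there was any.
        if padding:
            os.ftruncate(fd, len(data))
    finally:
        os.close(fd)
```

For example, write_usage_file("usage", "available: 5\n") over a longer existing file leaves exactly the new text. A reader that looks in between the write and the truncate sees the new text followed by blank lines, not a zero-length file.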


