Announce: MogileFS 2.09 & Fsck worker

Brad Fitzpatrick brad at
Fri Apr 20 09:36:49 UTC 2007


Hot out of the oven:




(Also all uploaded to CPAN, but not yet indexed)

What's new?

A filesystem checker!

  16199 pts/9    S+     0:00     \_ /usr/bin/perl ./mogilefsd
  16201 pts/9    S+     0:00         \_ ./mogilefsd [replicate]
  16202 pts/9    S+     0:00         \_ ./mogilefsd [replicate]
  16205 pts/9    S+     0:00         \_ ./mogilefsd [replicate]
  16206 pts/9    S+     0:00         \_ ./mogilefsd [delete]
  16207 pts/9    S+     0:00         \_ ./mogilefsd [queryworker]
  16208 pts/9    S+     0:01         \_ ./mogilefsd [monitor]
  16209 pts/9    S+     0:00         \_ ./mogilefsd [reaper]
  16210 pts/9    S+     0:00         \_ ./mogilefsd [fsck]      <--- see?

It hangs off the mogilefsd parent process (always exactly one fsck
process, not configurable), and you control it via the protocol.
MogileFS::Admin (in the MogileFS-Client package) has been updated to
speak it, and mogadm has been updated to use the new MogileFS::Admin:

$ mogadm fsck
Help for 'fsck' command:
 (enter any command prefix, leaving off options, for further help)

  mogadm fsck clearlog                       Clear the fsck log
  mogadm fsck printlog                       Display the fsck log
  mogadm fsck reset [opts]                   Reset fsck position back to the beginning
  mogadm fsck start                          Start (or resume) background fsck
  mogadm fsck status                         Show fsck status
  mogadm fsck stop                           Stop (pause) background fsck
  mogadm fsck taillog                        Tail the fsck log
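
If you want to drive these subcommands from a script, a thin wrapper is enough. The following is only a sketch: it assumes mogadm is on your PATH and already knows how to reach your trackers, and the helper names (fsck_argv, run_fsck) are made up for illustration. The subcommand list is copied from the help text above; the "[opts]" for reset are not spelled out there, so they are passed through untouched.

```python
import subprocess

# Subcommands exactly as listed in the `mogadm fsck` help output.
FSCK_SUBCOMMANDS = {
    "clearlog", "printlog", "reset", "start", "status", "stop", "taillog",
}

def fsck_argv(subcommand, *opts):
    """Build the argv for `mogadm fsck <subcommand>` (hypothetical helper)."""
    if subcommand not in FSCK_SUBCOMMANDS:
        raise ValueError(f"unknown fsck subcommand: {subcommand!r}")
    return ["mogadm", "fsck", subcommand, *opts]

def run_fsck(subcommand, *opts):
    """Run mogadm and return its stdout; raises if mogadm exits non-zero."""
    result = subprocess.run(
        fsck_argv(subcommand, *opts),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, run_fsck("start") followed by periodic run_fsck("status") calls would start a background fsck and poll it.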

$ mogadm fsck status

    Running: No
     Status: 13689 / 13689 (100.00%)
       Time: 46s (5 fids/s; 0s remain)
 Check Type: Normal (check policy + files)

 [num_GONE]: 872
 [num_MISS]: 1714
 [num_NOPA]: 15
 [num_SRCH]: 872
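
For monitoring, the status output above is easy to pick apart. This is a rough sketch that assumes the layout stays exactly as shown (one "key: value" pair per line, counters named [num_XXXX]); parse_fsck_status is a hypothetical name, not part of mogadm.

```python
import re

# The status output shown above, used here as sample input.
SAMPLE = """
    Running: No
     Status: 13689 / 13689 (100.00%)
       Time: 46s (5 fids/s; 0s remain)
 Check Type: Normal (check policy + files)

 [num_GONE]: 872
 [num_MISS]: 1714
 [num_NOPA]: 15
 [num_SRCH]: 872
"""

def parse_fsck_status(text):
    """Split status text into plain fields and integer event counters."""
    fields, counters = {}, {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or anything without a colon
        key, value = key.strip(), value.strip()
        match = re.fullmatch(r"\[num_(\w+)\]", key)
        if match:
            counters[match.group(1)] = int(value)
        else:
            fields[key] = value
    return fields, counters

fields, counters = parse_fsck_status(SAMPLE)
```

With the sample above, counters maps each four-letter event code to its count, and fields holds the Running / Status / Time / Check Type strings unchanged.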

It has two check types.  It can check just the replication policy
(e.g. after you change classes or mindevcounts, to make sure the
recorded locations still satisfy the policy), or it can do the default
"normal" check, which also checks that each file is where it's
supposed to be (using the mogstored sidechannel and/or HTTP HEAD
requests) and is the right size.
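
To make the per-location part of the "normal" check concrete, here is a minimal sketch of the HTTP HEAD variant only (the mogstored sidechannel is not covered). It is not the fsck worker's actual code: the function names and the "OK" result are invented for illustration, and treating any status other than 200 or 404 as an error is an assumption.

```python
import urllib.error
import urllib.request

def http_head(url, timeout=5):
    """Issue a HEAD request; return (status, content_length or None)."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            length = resp.headers.get("Content-Length")
            return resp.status, (int(length) if length is not None else None)
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; we only need the status code.
        return err.code, None

def check_location(url, expected_length, head=http_head):
    """Classify one expected file location as "OK", "MISS" or "BLEN"."""
    status, length = head(url)
    if status == 404:
        return "MISS"  # file is not where the database says it is
    if status != 200:
        # Assumption: anything else is a connectivity/server problem.
        raise RuntimeError(f"unexpected HTTP status {status} for {url}")
    if length != expected_length:
        return "BLEN"  # file exists but has the wrong size
    return "OK"
```

The head parameter is injectable so the classification logic can be exercised without a storage node, e.g. check_location(url, 10, head=lambda u: (404, None)).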

The events you'll see in the log:

use constant EV_NO_PATHS         => "NOPA";
use constant EV_POLICY_VIOLATION => "POVI";
use constant EV_FILE_MISSING     => "MISS";
use constant EV_BAD_LENGTH       => "BLEN";
use constant EV_CANT_FIX         => "GONE";
use constant EV_START_SEARCH     => "SRCH";
use constant EV_FOUND_FID        => "FOND";
use constant EV_RE_REPLICATE     => "REPL";

In summary:

* NOPA -- no paths were in file_on in mysql.  in this case,
  fsck proceeds to search all alive devices, trying to find
  a copy, somewhere, that's good.

* POVI -- policy violation.  the file_on rows don't satisfy the file
  class' replication policy.  either on wrong disks, or not enough,
  or both.

* SRCH -- a search over all devices has started.  this can
  happen either because of NOPA, or because all the file_on
  records pointed to hosts that were 404ing.  in any case,
  it's a last-ditch effort that'll probably never succeed,
  but maybe!  especially if you were trying to do something
  tricky rsyncing files around and somehow failed or messed
  something up at some point.  it'll find those copies.

* MISS -- a file_on location (which was supposed to exist) was
  checked to see if it did indeed still exist, and 404'd.  note
  that if the webserver or host is down, the fsck just stalls
  and waits... in the future it'll be better, but right now
  it only checks one fid at a time, and pauses on any connectivity
  issues.  the fsck log record will have the devid that missed,
  so you can detect trends (is it always that disk? etc)

* BLEN -- like MISS, but bogus-lengthed file instead of missing.

* GONE -- after SRCH failed and we can't do anything more.  this
  means a file is totally gone & can't be found.  note that
  the file table and file_on tables aren't cleaned (to be safe,
  in case fsck itself is buggy), so next fsck will find this problem
  again.  maybe in future, it'll be an option to auto-clean, or
  separate command.  who knows.  for now you can do it yourself.

* FOND - during a SRCH, an actual copy was found!  log includes

* REPL -- there was at least one good copy somewhere, and file was
  scheduled for re-replication so as to satisfy the needed
  replication policy
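
One way to read the event summary above as a flow for a single fid is sketched below. This is an interpretation of the descriptions, not the worker's real logic: the ordering of events, modelling the policy as a bare mindevcount, the probe callback (returning "OK", "MISS" or "BLEN"), and skipping already-probed devices during the search are all assumptions.

```python
from collections import Counter

def fsck_one(file_on, alive_devs, probe, mindevcount):
    """Return the list of (event_code, devid_or_None) for one fid.

    file_on     -- devids the database claims hold the file
    alive_devs  -- all devids that are currently alive
    probe       -- hypothetical callback: probe(devid) -> "OK"/"MISS"/"BLEN"
    mindevcount -- simplified stand-in for the replication policy
    """
    events, good = [], []
    if not file_on:
        events.append(("NOPA", None))
    elif len(file_on) < mindevcount:
        events.append(("POVI", None))

    # Check every location the database believes in.
    for devid in file_on:
        result = probe(devid)
        if result == "OK":
            good.append(devid)
        else:
            events.append((result, devid))  # MISS or BLEN, with the devid

    if not good:
        # Last-ditch search over the devices not already probed.
        events.append(("SRCH", None))
        for devid in alive_devs:
            if devid not in file_on and probe(devid) == "OK":
                events.append(("FOND", devid))
                good.append(devid)
        if not good:
            events.append(("GONE", None))
            return events

    if len(good) < mindevcount:
        events.append(("REPL", None))  # schedule re-replication
    return events

# Spotting trends: count MISS events per device across many fids.
results = {1: "OK", 2: "MISS"}
events = fsck_one([1, 2], [1, 2, 3], results.get, mindevcount=2)
misses_by_dev = Counter(devid for code, devid in events if code == "MISS")
```

Here a fid recorded on devices 1 and 2, with device 2 returning 404, yields a MISS for device 2 followed by REPL, and misses_by_dev answers the "is it always that disk?" question once you feed it events from a whole run.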

This is a very early version, but it seems to work.  Now that releases
are painless for me since I finally automated my release tools, I
wanted to get it out there early for feedback.

Future versions will:

 -- have regression tests (sorry about this version)

 -- be parallelizable, thus faster (didn't want to worry about that
    communication & locking yet.. that's tedious/boring.. wanted
    to get checking/fixing logic correct first)

 -- ...

Anyway, please try it out.

Any questions/feedback appreciated....

- Brad
