Announce: mogilefs-server 2.17

Robin H. Johnson robbat2 at gentoo.org
Fri Jul 13 23:27:56 UTC 2007


On Fri, Jul 13, 2007 at 03:33:30PM -0700, dormando wrote:
>  Also, fired up the fsck job yesterday :) 1 billion fids to do, current ETA 
>  is 6 months (down from 244 thousand years yesterday. It'll probably finish 
>  in a week). It's already found/fixed a ton of shit.
>       Status: 9133018 / 1035683850 (0.88%)
I've been wondering about ways to improve the estimation process (as
part of a cleanup of stats and fsck, planning for parallelization
stuff).

All of the following applies equally to a full check and to multiple
parallel checks of ranges.

Iff the fids are densely packed, then the estimation is very close to
correct. However, if the fids are only sparsely packed, then the
estimation tends to vary wildly. The fidsize checking against the
mogstoreds IS faster if a given range is densely packed.

Since it checks the fid space linearly, at any point we should know:
- lowestfid
- maxfid
- currentfid
- #checked 
- #total (expensive if SELECT COUNT is expensive on your DB).
- #remaining = #total - #checked
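
A minimal sketch of the counters listed above, as a small Python record.
The class name FsckState, its field names and the sample numbers are
hypothetical, not taken from the MogileFS source; the only derived value
is #remaining.

```python
from dataclasses import dataclass


@dataclass
class FsckState:
    """Hypothetical snapshot of a linear fsck pass over the fid space."""
    lowest_fid: int   # lowestfid
    max_fid: int      # maxfid (a moving target while files are added)
    current_fid: int  # currentfid, the position of the linear scan
    checked: int      # #checked
    total: int        # #total (may need an expensive SELECT COUNT)

    @property
    def remaining(self) -> int:
        # #remaining = #total - #checked
        return self.total - self.checked


# Made-up numbers, for illustration only.
state = FsckState(lowest_fid=1, max_fid=1_000_000, current_fid=250_000,
                  checked=200_000, total=400_000)
print(state.remaining)
```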

The present progress is simply currentfid/maxfid. This can cause wild
variations in the estimate, e.g. if your fids are dense at the start and
sparse later, or vice versa.

Both maxfid and #total are moving targets.

Using #checked/#total gives a better percentage completion, but as it
doesn't take the density into account, it can still lead to a bad time
estimate (dense and sparse ranges are not checked at the same speed).
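
To illustrate how far the two percentages can drift apart, here is a
rough Python sketch. The function names and the numbers are made up for
the example (a fid space whose first quarter holds half of all fids);
nothing here is taken from the actual fsck code.

```python
def progress_by_position(current_fid: int, max_fid: int) -> float:
    """Progress as currentfid/maxfid (position in the fid space)."""
    return current_fid / max_fid


def progress_by_count(checked: int, total: int) -> float:
    """Progress as #checked/#total (fraction of existing fids done)."""
    return checked / total


# Hypothetical snapshot: dense start, sparse tail.
current_fid, max_fid = 250_000, 1_000_000
checked, total = 200_000, 400_000

by_position = progress_by_position(current_fid, max_fid)
by_count = progress_by_count(checked, total)
print(f"currentfid/maxfid: {by_position:.1%}")  # 25.0%
print(f"#checked/#total:   {by_count:.1%}")     # 50.0%
```

With these numbers the position-based figure says a quarter of the work
is done while the count-based figure says half, and neither one says how
long the sparse tail will take.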

We have 3 measures of density:
- overall density: #total/(maxfid-lowestfid)
- previous density: #checked/(currentfid-lowestfid)
- remaining density: #remaining/(maxfid-currentfid)

The mean of previous density and remaining density, weighted by the
width of the fid range each one covers, should be the same as overall
density.
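
The three densities, and how they relate, can be sketched in Python as
follows. The function names and sample numbers are hypothetical. Note
that it is the mean weighted by fid-range width that reproduces the
overall density; a plain average of the two generally does not.

```python
import math


def densities(lowest_fid, max_fid, current_fid, checked, total):
    """Return (overall, previous, remaining) density.

    Assumes lowest_fid < current_fid < max_fid, so no range is empty.
    """
    remaining = total - checked
    overall = total / (max_fid - lowest_fid)
    previous = checked / (current_fid - lowest_fid)
    remaining_density = remaining / (max_fid - current_fid)
    return overall, previous, remaining_density


def weighted_mean(previous, remaining_density,
                  lowest_fid, max_fid, current_fid):
    """Mean of the two densities, weighted by the range each covers."""
    w_prev = current_fid - lowest_fid
    w_rem = max_fid - current_fid
    total_width = w_prev + w_rem
    return (previous * w_prev + remaining_density * w_rem) / total_width


# Made-up numbers: 200 of 400 fids sit in the first quarter of the range.
lowest_fid, max_fid, current_fid = 0, 1000, 250
overall, previous, rem = densities(lowest_fid, max_fid, current_fid,
                                   checked=200, total=400)
combined = weighted_mean(previous, rem, lowest_fid, max_fid, current_fid)
plain = (previous + rem) / 2

print(overall, previous, rem)
print(math.isclose(combined, overall))  # weighted mean matches overall
print(math.isclose(plain, overall))     # plain mean does not, here
```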

I haven't gotten further in planning how to get a better estimate out of
this yet, and it would be moot if the fsck design changed entirely (see
the various notes that suggest a file_to_fsck table).

-- 
Robin Hugh Johnson
Gentoo Linux Developer & Council Member
E-Mail     : robbat2 at gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85