fsck hangs when a bad fid is encountered
dormando
dormando at rydia.net
Sat Sep 8 07:45:27 UTC 2007
Were you looking in syslog, and also telnetting to a tracker and running
!watch? The latter is the most useful, I think. Having an error or two to
go on would improve my chances of figuring it out.
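For anyone who would rather script the !watch step than hold a telnet session open, here is a minimal Python sketch. It assumes the tracker accepts the same line-based commands on its listening port that a telnet session would send; the port 7001 and the host name are assumptions, not details from this thread, and the helper names are made up for illustration.

```python
import socket


def watch_tracker(sock, max_lines=20):
    """Send "!watch" on an already-connected socket and collect output lines.

    Hypothetical helper: it stops after max_lines so it can be used in a
    script instead of streaming forever like an interactive session.
    """
    sock.sendall(b"!watch\r\n")
    lines = []
    # makefile() gives a line-oriented text view of the socket stream.
    with sock.makefile("r", encoding="utf-8", errors="replace") as stream:
        for line in stream:
            lines.append(line.rstrip("\r\n"))
            if len(lines) >= max_lines:
                break
    return lines


def main(host, port=7001):
    """Connect to one tracker and print what !watch reports.

    Both host and port are placeholders; 7001 is only an assumed default.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.settimeout(None)  # block while waiting for new log lines
        for line in watch_tracker(sock):
            print(line)
```

Calling main("tracker.example.com") while fsck is running should print whatever the workers log, which may show why the fid is never resolved.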
I'm not up to speed yet on the FSCK job (I'll have to be next week,
though), so we'll have to hope one of the folks more familiar with it pipes up here.
-Dormando
Del Raco wrote:
> 13 devices (10 readonly, 3 alive). The 10 are readonly
> because they're nearly full; we're waiting on some new
> drives to arrive so the devices can be rebalanced.
>
> 4 trackers (1 fsck worker on each one).
>
> Didn't see any specific errors, but I'm not sure I was
> looking in all the right places. I manually deleted
> fids 1 and 2 from the file table and did a fsck reset
> after stopping each time. The output below is from
> running fsck for the third time. I waited around 30
> minutes after seeing the bad fid 678419 before
> stopping fsck.
>
> How long should I wait for SRCH to become GONE? Also,
> how do I go about finding how these fids became
> orphaned? Any way to prevent this in the future?
>
> Thanks.
>
> ========================================
>
> Output from "mogadm fsck status" (it's not currently
> running)
>
> Running: No
> Status: 669593 / 41601129 (1.61%)
> Time: 37m (297 fids/s; 2294m remain)
> Check Type: Normal (check policy + files)
>
> [num_NOPA]: 331
> [num_SRCH]: 331
>
> ========================================
>
> mysql> select fid, evcode, count(*) from fsck_log
> group by fid, evcode;
> +--------+--------+----------+
> | fid    | evcode | count(*) |
> +--------+--------+----------+
> |      1 | NOPA   |      119 |
> |      1 | SRCH   |      119 |
> |      2 | NOPA   |      147 |
> |      2 | SRCH   |      147 |
> | 678419 | NOPA   |      331 |
> | 678419 | SRCH   |      331 |
> +--------+--------+----------+
> 6 rows in set (0.00 sec)
>
> ========================================
>
> mogadm fsck taillog
> unixtime event fid devid
> 1189153502 NOPA 678419 -
> 1189153502 SRCH 678419 -
> 1189153508 NOPA 678419 -
> 1189153508 SRCH 678419 -
> 1189153513 NOPA 678419 -
> 1189153513 SRCH 678419 -
> 1189153519 NOPA 678419 -
> 1189153519 SRCH 678419 -
> 1189153524 NOPA 678419 -
> 1189153524 SRCH 678419 -
> 1189153530 NOPA 678419 -
> 1189153530 SRCH 678419 -
> 1189153535 NOPA 678419 -
> 1189153535 SRCH 678419 -
> 1189153541 NOPA 678419 -
> 1189153541 SRCH 678419 -
> 1189153546 NOPA 678419 -
> 1189153546 SRCH 678419 -
> 1189153551 NOPA 678419 -
> 1189153551 SRCH 678419 -
>
> ========================================
>
> mysql> select min(utime), max(utime) from fsck_log
> where fid = 678419;
> +------------+------------+
> | min(utime) | max(utime) |
> +------------+------------+
> | 1189151747 | 1189153551 |
> +------------+------------+
> 1 row in set (0.00 sec)
>
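As a rough sanity check on the numbers quoted above, the min/max utime values and the per-code event count for fid 678419 can be turned into an elapsed time and an average retry interval. This is only arithmetic on the figures in the quoted output; the function name is made up for illustration, and it assumes the 331 events per code are evenly spread between the first and last timestamps.

```python
FIRST_UTIME = 1189151747   # min(utime) for fid 678419, from the query above
LAST_UTIME = 1189153551    # max(utime) for fid 678419, from the query above
EVENTS_PER_CODE = 331      # NOPA count (same as the SRCH count)


def retry_stats(first, last, events):
    """Return (elapsed seconds, average seconds between consecutive events)."""
    elapsed = last - first
    # N events are separated by N - 1 gaps.
    interval = elapsed / (events - 1)
    return elapsed, interval


elapsed, interval = retry_stats(FIRST_UTIME, LAST_UTIME, EVENTS_PER_CODE)
print(f"stuck on the fid for {elapsed} s ({elapsed / 60:.1f} min)")
print(f"average gap between retries: {interval:.2f} s")
```

The result is about 30 minutes in total and roughly 5.5 seconds per retry, which matches the spacing in the taillog output and the "around 30 minutes" wait described in the message.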