do not have gaps in device ids
sanados
sanados at failure.at
Mon Apr 21 12:02:05 UTC 2008
I just took a look at my syslog. This is only meant as information for
others who might run into the same trouble.
It got spammed with these messages:
Apr 21 15:35:17 fsc2 mogilefsd[27626]: crash log: Device 7 doesn't exist. at /usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm line 50
Apr 21 15:35:18 fsc2 mogilefsd[11811]: Child 27626 (replicate) died: 256 (UNEXPECTED)
Apr 21 15:35:18 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 5, making 1.
Apr 21 15:35:22 fsc2 mogilefsd[27630]: crash log: Device 7 doesn't exist. at /usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm line 50
Apr 21 15:35:23 fsc2 mogilefsd[11811]: Child 27630 (replicate) died: 256 (UNEXPECTED)
Apr 21 15:35:23 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 5, making 1.
Apr 21 15:35:37 fsc2 mogilefsd[27636]: crash log: Device 7 doesn't exist. at /usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm line 50
Apr 21 15:35:38 fsc2 mogilefsd[11811]: Child 27636 (replicate) died: 256 (UNEXPECTED)
Apr 21 15:35:38 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 5, making 1.
Apr 21 15:35:42 fsc2 mogilefsd[27655]: crash log: Device 7 doesn't exist. at /usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm line 50
Apr 21 15:35:43 fsc2 mogilefsd[11811]: Child 27655 (replicate) died: 256 (UNEXPECTED)
Apr 21 15:35:43 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 5, making 1.
Device 7 is not in the DB, nor is any file (_* table) linked to this
device.
Device 7 was a spare HD (/dev/hdc1) with data on it. I planned to add
it to MogileFS after I had inserted the data on it into MogileFS
(it turned out to be a defective hard disk anyway).
(After assembling the computer it was hdc on the second machine in the
cluster, so it was meant to be dev7.)
But back to the errors:
After I added dev7 as a dead device to host 2, everything worked fine
and the errors stopped.
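
For anyone who wants to check their own syslog for the same symptom, here
is a minimal Python sketch that counts the "Device N doesn't exist" crash
messages per device id. The regular expression is an assumption based only
on the log lines quoted above, and missing_devices is a hypothetical helper
name, not part of MogileFS.

```python
import re
from collections import Counter

# Pattern guessed from the mogilefsd crash lines quoted in this mail;
# other MogileFS versions may word the message differently.
MISSING_DEV = re.compile(r"crash log: Device (\d+) doesn't\s+exist")


def missing_devices(log_text):
    """Return a Counter mapping device id -> number of crash messages."""
    # Collapse all whitespace first so mail-wrapped lines still match.
    flat = " ".join(log_text.split())
    return Counter(int(dev) for dev in MISSING_DEV.findall(flat))


if __name__ == "__main__":
    import sys

    # Usage: python missing_devices.py < /var/log/syslog
    for dev, hits in sorted(missing_devices(sys.stdin.read()).items()):
        print(f"dev{dev}: {hits} crash message(s)")
```

Any id it reports that is not known to the tracker is a candidate for the
same fix as above (registering it as a dead device).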
Another weird problem is that file_to_replicate had strange values.
It contained 90 rows with a nexttry value of 2147483647 (which translates
to Tuesday, January 19th 2038, 3:14:07 GMT).
No idea where that came from either.
After I changed nexttry to 0 the table got cleaned up.
(All files in there had already been replicated often enough, so MogileFS
just deleted those rows from the DB. My guess is that they were attempts
to replicate to dev7, which never existed.)
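
The nexttry value is easy to verify: 2147483647 is the largest signed
32-bit integer, and read as a Unix timestamp it lands on the date given
above. A small Python check (describe_nexttry is just an illustrative
name):

```python
from datetime import datetime, timezone

# Value seen in the nexttry column of file_to_replicate.
NEXTTRY = 2147483647


def describe_nexttry(value):
    """Interpret a nexttry value as a Unix timestamp in UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()


print(NEXTTRY == 2**31 - 1)        # True: max signed 32-bit integer
print(describe_nexttry(NEXTTRY))   # 2038-01-19T03:14:07+00:00
```

So the rows were scheduled as far in the future as a 32-bit timestamp can
express, which would explain why they never got retried on their own.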
Device 7 is still a no-show (though visible in the DB as a dead device):
root@fsc2:~# mogadm --trackers=10.10.10.132:7001 device list
fsc1 [1]: alive
                  used(G)   free(G)  total(G)
  dev1: alive      14.662   222.814   237.477
  dev2: alive      17.594   257.479   275.072
  dev3: alive      14.987   168.396   183.384
  dev4: alive      41.199   142.185   183.384
fsc2 [2]: alive
                  used(G)   free(G)  total(G)
  dev5: alive      20.813   237.761   258.574
  dev6: alive      24.292   269.118   293.410
  dev8: alive      23.549   269.861   293.410
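
To spot a gap like the missing dev7 without reading the listing by eye,
something along these lines could work. It is only a sketch: it assumes
the "devN:" format shown in the output above, and device_id_gaps is a
hypothetical helper name. A gap in this listing does not have to be a
problem, since a device registered as dead (like dev7 now) is also absent
from it.

```python
import re


def device_id_gaps(listing):
    """Return device ids missing between the lowest and highest id seen."""
    # Relies on the "devN:" labels in the device list output above.
    ids = sorted(int(n) for n in re.findall(r"\bdev(\d+):", listing))
    if not ids:
        return []
    present = set(ids)
    return [i for i in range(ids[0], ids[-1] + 1) if i not in present]


if __name__ == "__main__":
    import sys

    # Usage: mogadm --trackers=10.10.10.132:7001 device list | python gaps.py
    print(device_id_gaps(sys.stdin.read()))
```

Fed the listing above, it reports device 7 as the only gap.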