do not have gaps in device ids

sanados sanados at failure.at
Mon Apr 21 12:02:05 UTC 2008


I just took a look at my syslog. This is meant as information for
others who might run into the same trouble.

It was being spammed with these messages:

Apr 21 15:35:17 fsc2 mogilefsd[27626]: crash log: Device 7 doesn't 
exist.  at 
/usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm 
line 50
Apr 21 15:35:18 fsc2 mogilefsd[11811]: Child 27626 (replicate) died: 256 
(UNEXPECTED)
Apr 21 15:35:18 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 
5, making 1.
Apr 21 15:35:22 fsc2 mogilefsd[27630]: crash log: Device 7 doesn't 
exist.  at 
/usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm 
line 50
Apr 21 15:35:23 fsc2 mogilefsd[11811]: Child 27630 (replicate) died: 256 
(UNEXPECTED)
Apr 21 15:35:23 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 
5, making 1.
Apr 21 15:35:37 fsc2 mogilefsd[27636]: crash log: Device 7 doesn't 
exist.  at 
/usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm 
line 50
Apr 21 15:35:38 fsc2 mogilefsd[11811]: Child 27636 (replicate) died: 256 
(UNEXPECTED)
Apr 21 15:35:38 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 
5, making 1.
Apr 21 15:35:42 fsc2 mogilefsd[27655]: crash log: Device 7 doesn't 
exist.  at 
/usr/local/share/perl/5.8.7/MogileFS/ReplicationPolicy/MultipleHosts.pm 
line 50
Apr 21 15:35:43 fsc2 mogilefsd[11811]: Child 27655 (replicate) died: 256 
(UNEXPECTED)
Apr 21 15:35:43 fsc2 mogilefsd[11811]: Job replicate has only 4, wants 
5, making 1.



Device 7 is not in the DB, nor is any file (_* table) linked to this
device. Device 7 was a spare hard disk (/dev/hdc1) with data on it. I
planned to add it to MogileFS after I had copied the data on it into
MogileFS (it turned out to be a defective hard disk anyway).
(After assembling the computer it was hdc on the second machine in the
cluster, so it was meant to be dev7.)

But back to the errors:
after I added dev7 as a dead device on host 2, everything worked fine
and the errors stopped.

Another weird problem is that file_to_replicate had strange values.
It contained 90 rows with a nexttry value of 2147483647, the largest
signed 32-bit integer (as a Unix timestamp it translates to
Tuesday, January 19th 2038, 03:14:07 GMT).
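
For anyone who wants to double-check that date, here is a small Python
sketch. The helper name epoch_to_utc is made up for illustration; it
only assumes that nexttry is stored as Unix epoch seconds.

```python
from datetime import datetime, timezone

# 2147483647 is the largest signed 32-bit integer, the nexttry value seen above.
INT32_MAX = 2**31 - 1

def epoch_to_utc(seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

print(INT32_MAX)
print(epoch_to_utc(INT32_MAX).strftime("%A, %Y-%m-%d %H:%M:%S UTC"))
```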

No idea where that came from either.
After I changed nexttry to 0, the table got cleaned up.
(All the files in there had already been replicated often enough, so
MogileFS just deleted those rows from the DB. I guess they were attempts
to replicate to dev7, which never existed.)
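
The nexttry reset can be sketched roughly like this. This is only an
illustration against a stand-in table in an in-memory SQLite database:
the two-column table is a simplification and not the real MogileFS
schema, and the function name reset_stuck_rows is hypothetical. Back up
the real table before trying anything similar.

```python
import sqlite3

STUCK = 2147483647  # nexttry value found in the stuck rows

def reset_stuck_rows(conn):
    """Set nexttry to 0 on rows stuck at STUCK; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE file_to_replicate SET nexttry = 0 WHERE nexttry = ?", (STUCK,)
    )
    conn.commit()
    return cur.rowcount

# Stand-in table (assumption: simplified, not the real MogileFS schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_to_replicate (fid INTEGER PRIMARY KEY, nexttry INTEGER)"
)
conn.executemany(
    "INSERT INTO file_to_replicate (fid, nexttry) VALUES (?, ?)",
    [(1, STUCK), (2, STUCK), (3, 0)],
)

changed = reset_stuck_rows(conn)
print(changed)
```

Matching on the exact stuck value instead of updating every row leaves
rows with a legitimate nexttry untouched.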


Device 7 is still a no-show in the device list (though it is visible in
the DB as a dead device):
root@fsc2:~# mogadm --trackers=10.10.10.132:7001 device list
fsc1 [1]: alive
                   used(G) free(G) total(G)
  dev1: alive      14.662  222.814 237.477
  dev2: alive      17.594  257.479 275.072
  dev3: alive      14.987  168.396 183.384
  dev4: alive      41.199  142.185 183.384

fsc2 [2]: alive
                   used(G) free(G) total(G)
  dev5: alive      20.813  237.761 258.574
  dev6: alive      24.292  269.118 293.410
  dev8: alive      23.549  269.861 293.410
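
A quick way to spot such a gap before it bites is to compare the device
ids in use against a contiguous range. A minimal Python sketch, with the
ids typed in by hand from the listing above (the function name
missing_device_ids is made up for illustration, and nothing here talks
to the tracker):

```python
def missing_device_ids(device_ids):
    """Return the ids absent from the range min..max of device_ids, in order."""
    present = set(device_ids)
    if not present:
        return []
    return [i for i in range(min(present), max(present) + 1) if i not in present]

# Device ids as shown in the listing above: dev1 to dev6 and dev8.
listed = [1, 2, 3, 4, 5, 6, 8]
print(missing_device_ids(listed))
```

Any id it reports is a candidate for being registered as a dead device,
which is what made the replicate crashes stop here.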



