Waiting on replication

Andre' Hazelwood ahazelwood at ciradar.com
Wed Nov 14 15:04:12 UTC 2007


We are currently evaluating MFS as well, and noticed the same
behavior.  When I had mindevcount set to 1, mogtool would hang.
Once mindevcount was set to 2, the problem went away.
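
For anyone wanting to try the same change, a class's mindevcount can be
adjusted with mogadm. The following is a minimal sketch, assuming the
tracker addresses and the domain and class names (both "test") from
Brian's message quoted below. The variable names are illustrative only,
and the option spelling should be checked against the mogadm version
you have installed. The script only prints the command so it can be
reviewed before being run by hand.

```shell
# Values taken from the quoted message below; adjust for your own cluster.
TRACKERS="10.10.10.6:6001,10.10.10.5:6001"
DOMAIN="test"
CLASS="test"

# Build the mogadm invocation that sets the class's minimum device count to 2.
# (Assumed syntax: mogadm class modify <domain> <class> --mindevcount=N)
CMD="mogadm --trackers=$TRACKERS class modify $DOMAIN $CLASS --mindevcount=2"

# Print the command for review rather than executing it.
echo "$CMD"
```

Keep in mind that with a single storage host, whether a second copy can
actually be made depends on how the tracker is willing to place replicas.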

-Andre'

-----Original Message-----
From: mogilefs-bounces at lists.danga.com
[mailto:mogilefs-bounces at lists.danga.com] On Behalf Of dormando
Sent: Wednesday, November 14, 2007 2:56 AM
To: Brian Easley
Cc: mogilefs at lists.danga.com
Subject: Re: Waiting on replication

I'm not equipped to check this out properly right now, but have you
tried setting the mindevcount to 2 to see if it hangs the same way?

On the surface it smells like a bug in mogtool, but I guess it could be
a few things.
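
To illustrate one way such a hang could arise, here is a purely
hypothetical sketch. It assumes, without confirmation from mogtool's
source, that the replication wait compares the number of copies against
a fixed threshold instead of the class's mindevcount. The function
chunk_replicated and both variables are invented for this example.

```shell
# Hypothetical helper: succeeds when a chunk has enough copies.
# $1 = number of copies currently reported, $2 = copies required.
chunk_replicated() {
    [ "$1" -ge "$2" ]
}

# One copy exists, but the wait demands a fixed 2 copies: it never finishes.
if chunk_replicated 1 2; then hardcoded="done"; else hardcoded="waiting"; fi

# Same chunk, but the requirement is the class's mindevcount (1): it is done.
if chunk_replicated 1 1; then per_class="done"; else per_class="waiting"; fi

echo "fixed threshold of 2: $hardcoded"
echo "class mindevcount 1:  $per_class"
```

If the real cause were along these lines, it would match both symptoms
reported in this thread: a hang at mindevcount 1 and none at 2.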

-Dormando

Brian Easley wrote:
> Hello,
> 
> I'd like to preface this by saying that I've been reading the activity on
> this list for about a year, and would like to thank everyone who
> shares information and provides support on this list.
> 
> It seems like I've got MFS working fairly well. I can get the service to
> save a copy of a file, and we're successfully working on converting our
> webapp's PHP filesystem calls to use a customized MFS API. I've created
> a domain named test and a class named test that has a minimum
> replication count of 1.
> 
> I've got two trackers set up successfully and a single 13-disk fileserver
> running mogstored. Do I need more than one fileserver for testing?
> 
> The problem comes in after the file is successfully stored for the first
> copy. Here is what happens on the command line:
> 
> # mogtool --trackers=10.10.10.6:6001,10.10.10.5:6001 --domain=test
> --bigfile inject "CentOS-5.0-i386-bin-1of6.iso" 9108 --concurrent=10
> -overwrite --class=test
> chunk 9108,1: 0650b369fe27eea4da74b91c9e7fa556, len = 67108864
> Spawned child 27202 to deal with chunk number 1.
> chunk 9108,2: 8467daa5e96d28f7fadbdd3509cc10fa, len = 67108864
> Spawned child 27203 to deal with chunk number 2.
>        chunk 1 saved in 1.03 seconds.complete]
> chunk 9108,3: fb056e133949fe862d774dbc5046000a, len = 67108864
> Child 27202 successfully finished with chunk 1.
> Spawned child 27204 to deal with chunk number 3.
>        chunk 2 saved in 0.90 seconds.complete]
> chunk 9108,4: 472bfb08dabccfe5774f97cc2c341cf9, len = 67108864
> Child 27203 successfully finished with chunk 2.
> Spawned child 27205 to deal with chunk number 4.
>        chunk 3 saved in 0.87 seconds.complete]
> chunk 9108,5: c3925144b01ea80338ac1f83a1a84cf5, len = 67108864
> Child 27204 successfully finished with chunk 3.
> Spawned child 27206 to deal with chunk number 5.
>        chunk 4 saved in 0.97 seconds.complete]
> chunk 9108,6: ddee6e9600cb5d50e467e7da38258e89, len = 67108864
> Child 27205 successfully finished with chunk 4.
> Spawned child 27220 to deal with chunk number 6.
>        chunk 5 saved in 1.03 seconds.complete]
> chunk 9108,7: 893be1558fd09898c42939b1b2b8f7aa, len = 67108864
> Child 27206 successfully finished with chunk 5.
> Spawned child 27221 to deal with chunk number 7.
>        chunk 6 saved in 1.08 seconds.complete]
> chunk 9108,8: 0d4785963b957a8b6522e6ca62a58cc3, len = 67108864
> Child 27220 successfully finished with chunk 6.
> Spawned child 27222 to deal with chunk number 8.
>        chunk 7 saved in 1.02 seconds.complete]
> chunk 9108,9: 160dbcec10500e1cefe355558ad24fdd, len = 67108864
> Child 27221 successfully finished with chunk 7.
> Spawned child 27223 to deal with chunk number 9.
> chunk 9108,10: fe4875c855d1e5553b08c643ac17f045, len = 52004864
> Spawned child 27224 to deal with chunk number 10.
>        chunk 8 saved in 1.77 seconds.
>        chunk 10 saved in 1.10 seconds.
> Child 27222 successfully finished with chunk 8.
>        chunk 9 saved in 1.67 seconds.
> Child 27224 successfully finished with chunk 10.
> Child 27223 successfully finished with chunk 9.
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> 
> [root@fs01 ~]# mogtool --trackers=10.10.10.6:6001,10.10.10.5:6001
> --domain=test --bigfile --concurrent=10 -overwrite --class=test inject
> "CentOS-5.0-i386-bin-1of6.iso" 9108 --concurrent=10 -overwrite
> --class=test
> chunk 9108,1: 0650b369fe27eea4da74b91c9e7fa556, len = 67108864
> Spawned child 27266 to deal with chunk number 1.
> chunk 9108,2: 8467daa5e96d28f7fadbdd3509cc10fa, len = 67108864
> Spawned child 27267 to deal with chunk number 2.
>        chunk 1 saved in 1.02 seconds.complete]
> chunk 9108,3: fb056e133949fe862d774dbc5046000a, len = 67108864
> Child 27266 successfully finished with chunk 1.
> Spawned child 27268 to deal with chunk number 3.
>        chunk 2 saved in 0.91 seconds.complete]
> chunk 9108,4: 472bfb08dabccfe5774f97cc2c341cf9, len = 67108864
> Child 27267 successfully finished with chunk 2.
> Spawned child 27269 to deal with chunk number 4.
> chunk 9108,5: c3925144b01ea80338ac1f83a1a84cf5, len = 67108864
> Spawned child 27270 to deal with chunk number 5.
>        chunk 4 saved in 1.06 seconds.complete]
> chunk 9108,6: ddee6e9600cb5d50e467e7da38258e89, len = 67108864
> Child 27269 successfully finished with chunk 4.
> Spawned child 27271 to deal with chunk number 6.
>        chunk 3 saved in 2.90 seconds.complete]
> chunk 9108,7: 893be1558fd09898c42939b1b2b8f7aa, len = 67108864
>        chunk 5 saved in 1.65 seconds.
> Child 27268 successfully finished with chunk 3.
> Spawned child 27272 to deal with chunk number 7.
>        chunk 6 saved in 1.00 seconds.complete]
> chunk 9108,8: 0d4785963b957a8b6522e6ca62a58cc3, len = 67108864
> Child 27270 successfully finished with chunk 5.
> Child 27271 successfully finished with chunk 6.
>        chunk 7 saved in 0.79 seconds.
> Spawned child 27286 to deal with chunk number 8.
> chunk 9108,9: 160dbcec10500e1cefe355558ad24fdd, len = 67108864
> Child 27272 successfully finished with chunk 7.
>        chunk 8 saved in 0.83 seconds.
> Spawned child 27287 to deal with chunk number 9.
> chunk 9108,10: fe4875c855d1e5553b08c643ac17f045, len = 52004864
> Child 27286 successfully finished with chunk 8.
> Spawned child 27288 to deal with chunk number 10.
>        chunk 10 saved in 0.85 seconds.
>        chunk 9 saved in 1.70 seconds.
> Child 27288 successfully finished with chunk 10.
> Child 27287 successfully finished with chunk 9.
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> Beginning replication wait: 1 2 3 4 5 6 7 8 9 10
> 
> The "Beginning replication wait" line keeps repeating indefinitely.
> 
> Two questions:
> 
> 1) Why is it trying to replicate when the class used has a minimum
> replication count of 1?
> 2) Why is it hanging on replication anyway?
> 
> Any help would be greatly appreciated.
> 
> -Brian




