Parallel injects

dormando dormando at rydia.net
Tue Mar 18 07:51:29 UTC 2008


Hey,

Try writing a short Perl script to test parallel injections with.
mogtool waits in a loop until a file has been replicated "enough",
which often doesn't work well in test setups.
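
A minimal sketch of a bounded-parallel test driver follows. It is
written in shell rather than Perl only to match the script quoted
below; a Perl version would presumably call MogileFS::Client directly.
The names inject_one, run_parallel, MAX_JOBS and TOTAL are hypothetical,
and inject_one is a placeholder that sleeps instead of talking to
MogileFS, so this is an illustration of the pattern, not a tested
MogileFS client.

```shell
#!/bin/bash
# Sketch: keep at most MAX_JOBS injections running at once.
# inject_one is a hypothetical stand-in; replace its body with the
# real injection command for your setup.

MAX_JOBS=${MAX_JOBS:-5}   # concurrency limit
TOTAL=${TOTAL:-20}        # number of files to inject

inject_one() {
  # Placeholder work so the sketch runs without a MogileFS install.
  sleep 0.1
  echo "injected test-$1.txt"
}

run_parallel() {
  local n
  for n in $(seq 1 "$TOTAL"); do
    # Block while the number of running background jobs is at the limit.
    while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
      sleep 0.05
    done
    inject_one "$n" &
  done
  wait   # reap every remaining child before returning
}

run_parallel
```

Counting the shell's own background jobs avoids scanning the process
table, and the final wait means the driver exits only after every
injection has finished.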

You can try lowering the mindevcount for your test class, or adjusting
the options sent to mogtool (it has an option to not wait for
replication, I'm pretty sure?).
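
As a rough illustration of both suggestions: the tracker address and the
class name "testclass" below are placeholders, and "testdomain" is the
domain that appears in the strace output quoted below. The mogtool flag
is given from memory as --noreplwait, so treat it as an assumption and
check "perldoc mogtool" for your version before relying on it.

```shell
# Lower the replication target for the test class to a single device.
# (tracker address and class name are placeholders)
mogadm --trackers=127.0.0.1:7001 class modify testdomain testclass --mindevcount=1

# Inject without waiting for replication.
# (--noreplwait is an assumption; verify it against your mogtool docs)
mogtool inject --noreplwait /testfile.txt test-1.txt
```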

-Dormando

Michael A. Toth wrote:
> Hi there,
> 
> (Sorry for my bad English.) I built a test environment for MogileFS:
> a master-slave replicated pair of MySQL 5 databases, two trackers,
> four hosts (mogstored) with four devices, and MogileFS compiled from
> svn r1156. I would like to test parallel file injects, so I wrote a
> very simple shell script.
> 
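> #!/bin/bash
> 
> n=0
> 
> while true; do
> 
>   # Count running mogtool processes; unlike "ps | grep", pgrep does
>   # not count itself.
>   k=$(pgrep -f '/usr/bin/mogtool' | wc -l)
> 
>   # Already five running: pause briefly instead of spinning.
>   [ "$k" -ge 5 ] && { sleep 1; continue; }
> 
>   for j in $(seq 1 $((5 - k))); do
>     n=$((n + 1))
>     mogtool inject /testfile.txt "test-$n.txt" >/dev/null 2>&1 &
>   done
> 
> done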
> 
> This always keeps at least five inject processes running. When I run
> this script and watch the process list, after approximately 30-35
> seconds I see 3 or 4 zombie mogtool processes, and the other
> processes never stop. I checked these processes with strace and saw:
> 
> 7096  --- SIGCHLD (Child exited) @ 0 (0) ---
> 7096  select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout)
> 7096  waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 29678
> 7096  send(4, "get_paths domain=testdomain&noverify=1&key=test-16.txt\r\n", 58, MSG_NOSIGNAL) = 58
> 7096  select(8, [4], NULL, NULL, {3, 0}) = 1 (in [4], left {2, 997000})
> 7096  read(4, "ERR unknown_key unknown_key\r\n", 4096) = 29
> 7096  write(1, "Error: reaped child 29678 for chunk 1 but no paths exist... Retrying...\n", 72) = 72
> 7096  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7d87928) = 29685
> 7096  write(1, "Spawned child 29685 to deal with chunk number 1.\n", 49) = 49
> 7096  waitpid(-1, 0xbfe23d58, WNOHANG)  = 0
> 7096  select(0, NULL, NULL, NULL, {0, 100000} <unfinished ...>
> 
> (this repeats forever, or until I kill it)
> 
> The machines run Ubuntu Dapper i386 with a custom 2.6.23 kernel, all
> dependencies of the MogileFS Perl modules were built from CPAN, and I
> prefer Perlbal for the trackers.
> 
> Do you have any idea? Please let me know if you need more details.
> 
> Regards,
> Michael A. Toth
