Another data point for high cpu usage on MogileFS SVN

Nathan Schmidt nschmidt at gmail.com
Mon Feb 12 00:05:04 UTC 2007


Brad,

I've run into what looks like the same high-cpu utilization case Carl  
Forsythe saw on the parent mogilefsd process. The other child  
processes look normal. This is on Debian Etch with a hand-rolled  
grsec kernel (hence the random pids). This effect was after having  
loaded 8 million files (~100Gb). Things ran fine for a many hours but  
then latched up. This machine was one of three running mogstored plus  
mogilefsd, and in retrospect this pid looks like the only tracker  
actually involved in the bulk load -- our php driver picks a tracker  
once and sticks with it through a job's lifetime -- reasonable for  
web pages, less so for command-line batch jobs.

I believe (but can't prove from my logs) that this happened while  
loading a 600+ Mb file - there may have been a timeout or out-of- 
memory involved somewhere. Hardware is two new, otherwise idle  
machines with Gig-E between them.

Regards,
-Nathan / PBwiki


hostname:~# uname -a
Linux hostname 2.6.19.1.pbwiki-grsec #1 SMP Mon Jan 22 21:02:01 PST  
2007 i686 GNU/Linux


IO::AIO.pm  v2.33
Danga::Sockey.pm v1.56


hostname:~/build/mogilefs/trunk/server# svn log|head -2
------------------------------------------------------------------------
r726 | hachi | 2007-02-01 19:42:25 +0000 (Thu, 01 Feb 2007) | 1 line


hostname:/# lsof | grep 17481

mogilefsd 17481 mogilefsd  cwd       DIR        8,1    4096          2 /
mogilefsd 17481 mogilefsd  rtd       DIR        8,1    4096          2 /
mogilefsd 17481 mogilefsd  txt       REG        8,1 1061700    
45367516 /usr/bin/perl
mogilefsd 17481 mogilefsd  mem       REG        0,0                   
0 [heap] (stat: No such file or directory)
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669183 /lib/tls/i686/cmov/libnss_files-2.3.6.so (path inode=6668312)
mogilefsd 17481 mogilefsd  mem       REG        8,1   78500    
45369246 /usr/lib/libz.so.1.2.3
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669180 /lib/tls/i686/cmov/libnsl-2.3.6.so (path inode=6668309)
mogilefsd 17481 mogilefsd  mem       REG        8,1            
45372306 /usr/lib/libmysqlclient.so.15.0.0 (path inode=45370715)
mogilefsd 17481 mogilefsd  mem       REG        8,1  110228    
45416897 /usr/lib/perl5/auto/DBD/mysql/mysql.so
mogilefsd 17481 mogilefsd  mem       REG        8,1  113404    
45416813 /usr/lib/perl5/auto/DBI/DBI.so
mogilefsd 17481 mogilefsd  mem       REG        8,1   22445    
45484893 /usr/local/lib/perl/5.8.8/auto/BSD/Resource/Resource.so
mogilefsd 17481 mogilefsd  mem       REG        8,1   36096    
45400136 /usr/lib/perl/5.8.8/auto/List/Util/Util.so
mogilefsd 17481 mogilefsd  mem       REG        8,1   18680    
45401924 /usr/lib/perl/5.8.8/auto/Time/HiRes/HiRes.so
mogilefsd 17481 mogilefsd  mem       REG        8,1  111304    
45400140 /usr/lib/perl/5.8.8/auto/POSIX/POSIX.so
mogilefsd 17481 mogilefsd  mem       REG        8,1   19764    
45400144 /usr/lib/perl/5.8.8/auto/Socket/Socket.so
mogilefsd 17481 mogilefsd  mem       REG        8,1   15640    
45400132 /usr/lib/perl/5.8.8/auto/IO/IO.so
mogilefsd 17481 mogilefsd  DEL       REG        8,1            
45481986 /usr/lib/locale/locale-archive
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669176 /lib/tls/i686/cmov/libcrypt-2.3.6.so (path inode=6668305)
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669174 /lib/tls/i686/cmov/libc-2.3.6.so (path inode=6668303)
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669170 /lib/tls/i686/cmov/libpthread-2.3.6.so (path inode=6668317)
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669178 /lib/tls/i686/cmov/libm-2.3.6.so (path inode=6668307)
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6669177 /lib/tls/i686/cmov/libdl-2.3.6.so (path inode=6668306)
mogilefsd 17481 mogilefsd  mem       REG        8,1   13744    
45401920 /usr/lib/perl/5.8.8/auto/Sys/Syslog/Syslog.so
mogilefsd 17481 mogilefsd  mem       REG        8,1             
6651919 /lib/ld-2.3.6.so (path inode=6652273)
mogilefsd 17481 mogilefsd    0u      CHR        1,3                
1085 /dev/null
mogilefsd 17481 mogilefsd    1u      CHR        1,3                
1085 /dev/null
mogilefsd 17481 mogilefsd    2u      CHR        1,3                
1085 /dev/null
mogilefsd 17481 mogilefsd    3r      REG        8,1    2131    
45384879 /usr/local/bin/mogilefsd
mogilefsd 17481 mogilefsd    4u     sock        0,4               
24273 can't identify protocol
mogilefsd 17481 mogilefsd    5u     unix 0xc797c580             
3649942 socket
mogilefsd 17481 mogilefsd    6u     IPv4      24276                 
TCP *:afs3-callback (LISTEN)
mogilefsd 17481 mogilefsd    7r     0000        0,9       0       
24277 eventpoll
mogilefsd 17481 mogilefsd    8u     unix 0xf761b900               
24278 socket
mogilefsd 17481 mogilefsd    9u     unix 0xc7271e40             
3649960 socket
mogilefsd 17481 mogilefsd   10u     unix 0xc7271ac0             
3649962 socket
mogilefsd 17481 mogilefsd   11u     unix 0xc7271740             
3649967 socket
mogilefsd 17481 mogilefsd   12u     unix 0xc72713c0             
3649969 socket
mogilefsd 17481 mogilefsd   13u     unix 0xc797c740             
3649943 socket
mogilefsd 17481 mogilefsd   14u     unix 0xc7271040             
3649974 socket
mogilefsd 17481 mogilefsd   15u     unix 0xcf9d2e40            
19259941 socket
mogilefsd 17481 mogilefsd   16u     unix 0xf7f72040            
19190880 socket
mogilefsd 17481 mogilefsd   17u     unix 0xf761b740            
19259913 socket
mogilefsd 17481 mogilefsd   18u     unix 0xe8eb5200            
19259906 socket
mogilefsd 17481 mogilefsd   19u     IPv4   19259882                 
TCP hostname:afs3-callback->foxtrot:56909 (ESTABLISHED)
mogilefsd 17481 mogilefsd   20u     unix 0xe7e12900            
17657376 socket
mogilefsd 17481 mogilefsd   21u     unix 0xf7f3b580            
19259924 socket
mogilefsd 17481 mogilefsd   22u     unix 0xf7554040            
19259911 socket
mogilefsd 17481 mogilefsd   23u     IPv4   19188088                 
TCP hostname:afs3-callback->foxtrot:24946 (ESTABLISHED)
mogilefsd 17481 mogilefsd   24u     unix 0xf4f3f040            
19259935 socket
mogilefsd 17481 mogilefsd   25u     unix 0xe8c7b040             
3651157 socket
mogilefsd 17481 mogilefsd   26u     unix 0xf319cc80               
24314 socket
mogilefsd 17481 mogilefsd   27u     unix 0xda0c1e40            
19259938 socket
mogilefsd 17481 mogilefsd   28u     IPv4   19257017                 
TCP hostname:afs3-callback->foxtrot:24955 (ESTABLISHED)
mogilefsd 17481 mogilefsd   29u     IPv4   17657247                 
TCP hostname:afs3-callback->foxtrot:56326 (ESTABLISHED)
mogilefsd 17481 mogilefsd   30u     IPv4   17657269                 
TCP hostname:afs3-callback->foxtrot:56329 (ESTABLISHED)
mogilefsd 17481 mogilefsd   33u     unix 0xf7f3b200            
19259921 socket


hostname:/# strace -tt -p 17481
Process 17481 attached - interrupt to quit
08:22:04.369315 time(NULL)              = 1171182124
08:22:04.369419 epoll_wait(7, {{EPOLLOUT, {u32=9, u64=9}}, {EPOLLOUT,  
{u32=11, u64=11}}, {EPOLLOUT, {u32=10, u64=10}}, {EPOLLOUT, {u32=12,  
u64=12}}, {EPOLLOUT, {u32=13, u64=13}}}, 1000, 250) = 5
08:22:04.369528 time(NULL)              = 1171182124
08:22:04.369566 epoll_wait(7, {{EPOLLOUT, {u32=13, u64=13}},  
{EPOLLOUT, {u32=12, u64=12}}, {EPOLLOUT, {u32=10, u64=10}},  
{EPOLLOUT, {u32=11, u64=11}}, {EPOLLOUT, {u32=9, u64=9}}}, 1000, 250)  
= 5
08:22:04.369663 time(NULL)              = 1171182124
08:22:04.369701 epoll_wait(7, {{EPOLLOUT, {u32=9, u64=9}}, {EPOLLOUT,  
{u32=11, u64=11}}, {EPOLLOUT, {u32=10, u64=10}}, {EPOLLOUT, {u32=12,  
u64=12}}, {EPOLLOUT, {u32=13, u64=13}}}, 1000, 250) = 5
08:22:04.369797 time(NULL)              = 1171182124
08:22:04.369833 epoll_wait(7, {{EPOLLOUT, {u32=13, u64=13}},  
{EPOLLOUT, {u32=12, u64=12}}, {EPOLLOUT, {u32=10, u64=10}},  
{EPOLLOUT, {u32=11, u64=11}}, {EPOLLOUT, {u32=9, u64=9}}}, 1000, 250)  
= 5
08:22:04.369929 time(NULL)              = 1171182124
08:22:04.369967 epoll_wait(7, {{EPOLLOUT, {u32=9, u64=9}}, {EPOLLOUT,  
{u32=11, u64=11}}, {EPOLLOUT, {u32=10, u64=10}}, {EPOLLOUT, {u32=12,  
u64=12}}, {EPOLLOUT, {u32=13, u64=13}}}, 1000, 250) = 5
08:22:04.370067 time(NULL)              = 1171182124
08:22:04.370104 epoll_wait(7, {{EPOLLOUT, {u32=13, u64=13}},  
{EPOLLOUT, {u32=12, u64=12}}, {EPOLLOUT, {u32=10, u64=10}},  
{EPOLLOUT, {u32=11, u64=11}}, {EPOLLOUT, {u32=9, u64=9}}}, 1000, 250)  
= 5
08:22:04.370201 time(NULL)              = 1171182124
08:22:04.370238 epoll_wait(7, {{EPOLLOUT, {u32=9, u64=9}}, {EPOLLOUT,  
{u32=11, u64=11}}, {EPOLLOUT, {u32=10, u64=10}}, {EPOLLOUT, {u32=12,  
u64=12}}, {EPOLLOUT, {u32=13, u64=13}}}, 1000, 250) = 5
08:22:04.370334 time(NULL)              = 1171182124


hostname:/# top

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
17481 mogilefs  25   0 17264  11m 1428 R  100  0.2   1889:40 /usr/bin/ 
perl /usr/local/bin/mogilefsd --daemonize
2703 mogilefs  15   0 18368  13m 1888 S    0  0.2 137:15.46 /usr/ 
local/bin/mogilefsd [replicate]
6173 mogilefs  15   0 17264  12m 1896 S    0  0.2  14:31.54 /usr/ 
local/bin/mogilefsd [queryworker]
21573 mogilefs  15   0 17264  12m 1900 S    0  0.2  26:35.22 /usr/ 
local/bin/mogilefsd [queryworker]
23070 mogilefs  15   0 15196   9m 1584 S    0  0.2   1:40.66 /usr/ 
local/bin/mogilefsd [reaper]
2141 mogilefs  15   0 18844  13m 1888 S    0  0.2  35:48.03 /usr/ 
local/bin/mogilefsd [replicate]
2602 mogilefs  15   0 17264  12m 1896 S    0  0.2  17:51.28 /usr/ 
local/bin/mogilefsd [queryworker]
3713 mogilefs  15   0 17264  12m 1896 S    0  0.2  23:40.23 /usr/ 
local/bin/mogilefsd [queryworker]
4301 mogilefs  15   0 17264  12m 1896 S    0  0.2  23:39.64 /usr/ 
local/bin/mogilefsd [queryworker]
5201 mogilefs  15   0 17264  12m 1900 S    0  0.2  17:47.23 /usr/ 
local/bin/mogilefsd [queryworker]
...
15796 root      15   0 85268  80m 1440 S    0  1.3 208:18.32 /usr/bin/ 
perl /usr/local/bin/mogstored --daemonize
8048 root      18   0 11052 7152  684 S    0  0.1   0:01.97 mogstored  
[iostat]
7445 root      18   0 11052 7216  752 S    0  0.1   0:00.44 mogstored  
[diskusage]




hostname:~/build/mogilefs/trunk/server# opreport
CPU: Core 2, speed 2128.14 MHz (estimated)
	Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with  
a unit mask of 0x00 (Unhalted core cycles) count 100000
CPU_CLK_UNHALT...|
   samples|      %|
------------------
2965751982 86.1908 perl
         CPU_CLK_UNHALT...|
           samples|      %|
         ------------------
         2956591714 99.6911 perl
           6625653  0.2234 anon (tgid:17481 range:0xb7eec000-0xb7eed000)
            571737  0.0193 anon (tgid:15796 range:0xb7f7b000-0xb7f7c000)
            387798  0.0131 anon (tgid:2703 range:0xb7eec000-0xb7eed000)
            184558  0.0062 anon (tgid:21288 range:0xb7eec000-0xb7eed000)
            144528  0.0049 anon (tgid:9084 range:0xb7eec000-0xb7eed000)
            131671  0.0044 anon (tgid:22171 range:0xb7eec000-0xb7eed000)
            128116  0.0043 anon (tgid:21573 range:0xb7eec000-0xb7eed000)
            127853  0.0043 anon (tgid:2141 range:0xb7eec000-0xb7eed000)
            109357  0.0037 anon (tgid:9918 range:0xb7eec000-0xb7eed000)
            109346  0.0037 anon (tgid:12743 range:0xb7eec000-0xb7eed000)
            108362  0.0037 anon (tgid:3713 range:0xb7eec000-0xb7eed000)
            108204  0.0036 anon (tgid:4301 range:0xb7eec000-0xb7eed000)
             74108  0.0025 anon (tgid:21473 range:0xb7eec000-0xb7eed000)
             69463  0.0023 anon (tgid:5201 range:0xb7eec000-0xb7eed000)
             69030  0.0023 anon (tgid:2602 range:0xb7eec000-0xb7eed000)
             67257  0.0023 anon (tgid:31006 range:0xb7eec000-0xb7eed000)
             45047  0.0015 anon (tgid:6173 range:0xb7eec000-0xb7eed000)
             15858 5.3e-04 anon (tgid:31730 range:0xb7eec000-0xb7eed000)
              9685 3.3e-04 anon (tgid:6626 range:0xb7eec000-0xb7eed000)
              8387 2.8e-04 anon (tgid:14443 range:0xb7eec000-0xb7eed000)
              8178 2.8e-04 anon (tgid:1313 range:0xb7eec000-0xb7eed000)
              6945 2.3e-04 anon (tgid:22540 range:0xb7eec000-0xb7eed000)
              6196 2.1e-04 anon (tgid:19615 range:0xb7eec000-0xb7eed000)
              6096 2.1e-04 anon (tgid:23070 range:0xb7eec000-0xb7eed000)
              4441 1.5e-04 anon (tgid:10476 range:0xb7eec000-0xb7eed000)
              3400 1.1e-04 anon (tgid:24446 range:0xb7eec000-0xb7eed000)
              3269 1.1e-04 anon (tgid:4299 range:0xb7eec000-0xb7eed000)
              2711 9.1e-05 anon (tgid:1474 range:0xb7eec000-0xb7eed000)
              2596 8.8e-05 anon (tgid:16955 range:0xb7eec000-0xb7eed000)
              2513 8.5e-05 anon (tgid:25095 range:0xb7eec000-0xb7eed000)
              2501 8.4e-05 anon (tgid:26558 range:0xb7eec000-0xb7eed000)
              2469 8.3e-05 anon (tgid:10617 range:0xb7eec000-0xb7eed000)
              2400 8.1e-05 anon (tgid:16616 range:0xb7eec000-0xb7eed000)
              2239 7.5e-05 anon (tgid:11412 range:0xb7eec000-0xb7eed000)
              2204 7.4e-05 anon (tgid:8048 range:0xb7f7b000-0xb7f7c000)
              2201 7.4e-05 anon (tgid:32125 range:0xb7eec000-0xb7eed000)
              1330 4.5e-05 anon (tgid:10899 range:0xb7eec000-0xb7eed000)
              1322 4.5e-05 anon (tgid:16287 range:0xb7eec000-0xb7eed000)
               353 1.2e-05 anon (tgid:19203 range:0xb7eec000-0xb7eed000)
               336 1.1e-05 anon (tgid:23388 range:0xb7eec000-0xb7eed000)
               321 1.1e-05 anon (tgid:7445 range:0xb7f7b000-0xb7f7c000)
               127 4.3e-06 anon (tgid:24046 range:0xb7eec000-0xb7eed000)
                91 3.1e-06 anon (tgid:11864 range:0xb7eec000-0xb7eed000)
                 6 2.0e-07 anon (tgid:7592 range:0xb7eec000-0xb7eed000)
                 3 1.0e-07 anon (tgid:28314 range:0xb7eec000-0xb7eed000)
                 2 6.7e-08 anon (tgid:7840 range:0xb7eec000-0xb7eed000)


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.danga.com/pipermail/mogilefs/attachments/20070211/87aa01ad/attachment-0001.htm


More information about the mogilefs mailing list