Poor web serving performance

Cal Henderson calh at yahoo-inc.com
Thu Feb 1 20:49:58 UTC 2007


Brad Fitzpatrick <brad at danga.com> wrote:
: And the IO::AIO test suite passes?  (I assume, since you said the
: Linux::AIO one failed...)

yes - passes 100%


: Definitely looks like it's not working.  I wonder if the tests aren't
: aggressive enough... can you tell if they exercise the set_minimum()
: function at all and verify somehow that it works?  (counting
: threads/processes, or testing max outstanding AIOs... though the
: latter seems hard without being able to fake and slow down the
: filesystem...)

it uses it, but doesn't check for threads actually being spawned. when 
i run perlbal using linuxthreads i definitely see multiple threads 
(even if not the correct count). the IO::AIO notes indicate that it no 
longer spawns the minimum number of threads at startup:

  IO::AIO starts threads only on demand, when an AIO request
  is queued and no free thread exists. Please note that
  queueing up a hundred requests can create demand for a
  hundred threads, even if it turns out that everything is in
  the cache and could have been processed faster by a single
  thread.

but that shouldn't mean that it never reaches its peak 'min' number. 
not sure what's up with that.
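
One rough way to check whether worker threads are actually being spawned is to count the kernel tasks of the running process from outside. This is a minimal sketch in Python, assuming a Linux host with /proc mounted and an NPTL-style threading library (under the older linuxthreads model each thread appears as a separate process, so this count would not apply). count_threads is a hypothetical helper for illustration, not part of perlbal or IO::AIO.

```python
import os

def count_threads(pid):
    """Count kernel tasks (threads) of a process on Linux.

    Assumes /proc/<pid>/task exists, with one entry per thread.
    """
    return len(os.listdir(f"/proc/{pid}/task"))
```

Calling count_threads() with the perlbal pid before and during a load test would show whether the thread count grows as AIO requests are queued.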


using aio channels tells me that i definitely have multiple tasks 
queued up:

aio
chan-filer6 ctr_queued 0
chan-filer6 cur_running 312
chan-filer6 ctr_total 5554
chan-filer6 cur_queued 0
chan-filer7 ctr_queued 0
chan-filer7 cur_running 185
chan-filer7 ctr_total 3530
chan-filer7 cur_queued 0
.

but that doesn't say how many of them are being executed in parallel - 
just that a lot are queued. i'm probably being dense, but i can't 
figure out a reasonable test to see if async stuff is really happening 
in parallel.
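
A generic way to test for real parallelism is to wrap each job in a counter that records how many jobs are in flight at once, and to make each job artificially slow so overlap becomes visible. The sketch below shows the idea in Python with a plain thread pool. It does not use IO::AIO or perlbal; ConcurrencyProbe, slow_job and measure_peak are hypothetical names, and time.sleep stands in for a slow filesystem call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class ConcurrencyProbe:
    """Tracks how many jobs are inside the probed region at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.current -= 1
        return False

def slow_job(probe, delay):
    # Stand-in for a slow filesystem operation (assumption for illustration).
    with probe:
        time.sleep(delay)

def measure_peak(workers, jobs, delay=0.05):
    """Run `jobs` slow jobs on `workers` threads; return (peak in flight, seconds)."""
    probe = ConcurrencyProbe()
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(jobs):
            pool.submit(slow_job, probe, delay)
    elapsed = time.monotonic() - start
    return probe.peak, elapsed
```

If the peak stays at 1, or the elapsed time is close to jobs * delay, the work is effectively serialized. A peak near the worker count and a much shorter elapsed time suggest the jobs really do overlap.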

argh - i just noticed that perlbal is spiking to 99.9% CPU when i 
throw this traffic at it. is that to be expected if it's spinning 
waiting for aio requests to complete, or does that indicate that i'm 
spending all my time in aio callbacks? (running a plain vanilla 
web_server backend instead of my custom one gives the same results - 
so it doesn't appear to be time spent inside my plugin).

some strace summaries for comparison - when i overload perlbal and requests 
start failing:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 39.91    0.638100          17     37852           futex
 16.59    0.265201           7     38693           epoll_ctl
  6.11    0.097642           9     11453      1753 close
  5.04    0.080565           4     19610           fcntl64
  4.44    0.070971          10      7385      1258 read


and when i send a lower request rate and nothing fails:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.33    1.041618          83     12622           epoll_wait
 21.76    0.450431          21     21690           futex
  5.04    0.104227           4     29300           time
  3.94    0.081629           4     18520           epoll_ctl
  3.45    0.071367          22      3232           sendfile
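
To compare the two summaries side by side, the rows can be parsed into per-syscall records. This is a small sketch in Python that assumes the column layout shown above (% time, seconds, usecs/call, calls, optional errors, syscall). parse_strace_summary is a hypothetical helper, and the two strings simply repeat the rows posted above.

```python
OVERLOADED = """\
 39.91    0.638100          17     37852           futex
 16.59    0.265201           7     38693           epoll_ctl
  6.11    0.097642           9     11453      1753 close
  5.04    0.080565           4     19610           fcntl64
  4.44    0.070971          10      7385      1258 read
"""

HEALTHY = """\
 50.33    1.041618          83     12622           epoll_wait
 21.76    0.450431          21     21690           futex
  5.04    0.104227           4     29300           time
  3.94    0.081629           4     18520           epoll_ctl
  3.45    0.071367          22      3232           sendfile
"""

def parse_strace_summary(text):
    """Parse summary rows into a dict keyed by syscall name."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields, or 6 when the errors column is filled in.
        if len(parts) not in (5, 6):
            continue
        try:
            row = {
                "pct_time": float(parts[0]),
                "seconds": float(parts[1]),
                "usecs_per_call": int(parts[2]),
                "calls": int(parts[3]),
                "errors": int(parts[4]) if len(parts) == 6 else 0,
            }
        except ValueError:
            continue  # header or separator line
        rows[parts[-1]] = row
    return rows

overloaded = parse_strace_summary(OVERLOADED)
healthy = parse_strace_summary(HEALTHY)

for name in sorted(set(overloaded) | set(healthy)):
    o = overloaded.get(name, {}).get("pct_time")
    h = healthy.get(name, {}).get("pct_time")
    print(f"{name:12} overloaded={o} healthy={h}")
```

In the posted rows, epoll_wait accounts for about half of the syscall time in the healthy run and does not appear among the top entries of the overloaded run, while futex and epoll_ctl dominate there. Only the top five rows of each summary were posted, so this comparison is partial.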


--cal 