Patch and RFC: Perlbal OOM with buffered uploads and limited disk bandwidth. Also other upload weirdness

Brad Fitzpatrick brad at danga.com
Tue May 15 19:31:29 UTC 2007


Interesting, but how is that not handled by this existing code in
ClientProxy.pm:

    # deal with chunked uploads
    if (my $cus = $self->{chunked_upload_state}) {
        $cus->on_readable($self);

        # if we got more than 1MB not flushed to disk,
        # stop reading for a bit until disk catches up
        if ($self->{read_ahead} > 1024*1024) {
            $self->watch_read(0);
        }
        return;
    }

That seems like it should do it already?

If not, could you write a failing test for this, by starting with the
existing buffered upload test, copying it to a new test, and making it
just do a large buffered-to-disk upload with forced delays in the Perlbal
server at the right place (faking a slow disk), forcing the OOM?

In the server do something like:

  sleep 1 if $ENV{PERLBAL_T_SLOWDISK};

(or something like that). Then, in the test, set the environment variable
before you start the Perlbal server.
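
In sketch form the test side is just something like this, before the test
forks the Perlbal server (so the child process inherits it and the
slow-disk sleep above kicks in):

  # hypothetical sketch - set the flag before whatever start_server-style
  # helper the existing buffered upload test uses launches Perlbal
  $ENV{PERLBAL_T_SLOWDISK} = 1;

Then push a few-hundred-MB buffered upload through it: with the current
code that should hit the OOM you're describing, and with your watch_read()
change it should pass.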

- Brad


On Tue, 15 May 2007, Jeremy James wrote:

> We've had problems recently doing large (a few hundred MB) POSTs to
> test servers on a fast network, with Perlbal buffering the data onto
> slow disks - typically an NFS share.
>
> During such an upload, the read_ahead gets large (not massive -
> typically around 60MB) and gives an 'Out of memory' error, rather than
> telling the socket to back off. There is, however, support for exactly
> this case with chunked uploads, so LiveJournal must have come across it
> before (about line 590 in ClientProxy.pm - in r617 by bradfitz)?
>
> Attached is a patch to call watch_read(0) if read_ahead gets above 1MB
> (and watch_read(1) when it drops back below). With it applied we're
> running much better, without any crashes - in the worst case the user
> sees a paused upload for a few seconds before it continues.
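>
> The core of the change looks roughly like this (a simplified sketch of
> the idea - the attached patch has the exact details), mirroring the
> chunked-upload handling mentioned above:
>
>     # in buffered_upload_update, after read_ahead has been updated:
>     if ($self->{read_ahead} > 1024*1024) {
>         # more than 1MB waiting to be flushed to disk - stop reading
>         # from the client until the disk catches up
>         $self->watch_read(0);
>     } else {
>         # caught up again - resume reading from the client
>         $self->watch_read(1);
>     }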
>
> Any thoughts on this? Is watch_read actually expensive to call every
> time buffered_upload_update runs? Are there any other issues with
> buffered uploads and slow disk bandwidth?
>
>
> With large files like the above, it can take some time to send the
> buffered upload on to a backend Apache server (again, if the buffered
> data is coming back off NFS). During this time the client often appears
> to either drop the connection and never show the result of the POST
> (the confirmation message from Apache), or to start the POST again from
> the beginning (especially noticeable with the upload tracking - it uses
> the same upload_session).
>
> Is this a case where alive_time isn't being updated for the client
> while data is being sent to the backend? Obviously this may be more of
> a theoretical situation than a practical one - we'll be using local,
> fast disks and have more memory on our production machines - but it
> would be worth knowing whether it could become an issue as uploads grow
> in size.
>
> I've had a poke around with updating alive_time inside
> continue_buffered_upload, and with watching when alive_time is updated
> on particular sockets, but the results are inconsistent (sometimes
> Perlbal blocks completely, which makes getting results out of the
> management console awkward). Is there anything else I should look at
> during the transfer to the backend to help debug this?
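>
> (What I tried is roughly just this at the top of
> continue_buffered_upload, so the client doesn't look idle while its
> buffered file is being streamed to the backend:
>
>     $self->{alive_time} = time();
>
> but, as above, the results so far are inconsistent.)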
>
> Best wishes,
> Jeremy
>

