bugs in PRINT/_write and do_request

Justin Azoff JAzoff at uamail.albany.edu
Sat May 7 14:41:46 PDT 2005


I've been working on making the python client reconnect during writes.
I noticed that with the perl client, if you run
----------------
use MogileFS;
my $mogfs = MogileFS->new(domain => 'test',
                          hosts  => [ 'peter:7001' ]);
die "Unable to initialize MogileFS object.\n" unless $mogfs;


my $file_contents = $mogfs->get_file_data("motd");
print "$$file_contents\n";

sleep(10);
$file_contents = $mogfs->get_file_data("motd");
print "$$file_contents\n";
--------------------
and during the sleep(10) restart the tracker, you get

MogileFS::Backend: socket closed on read at /usr/share/perl5/MogileFS.pm line 129

on the second call.

This is because _get_sock never gets a chance to be called again, so the
closed socket is reused.  I accounted for this in the python client by
having it retry do_request once, which forces _get_sock to be called.
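
Roughly, the retry looks like the sketch below.  This is only an
illustration, not the actual client code: the Backend class, the
BackendError exception and the connect callable are made-up names, and
connect stands in for whatever opens the tracker socket.

```python
class BackendError(Exception):
    """Raised when the tracker connection is unusable (hypothetical name)."""


class Backend:
    def __init__(self, connect):
        # connect: callable returning a file-like socket object with
        # write(), flush() and readline().  An assumption made so the
        # sketch is self-contained.
        self._connect = connect
        self._sock = None

    def _get_sock(self):
        # Reuse the cached socket, or open a new one if it was dropped.
        if self._sock is None:
            self._sock = self._connect()
        return self._sock

    def _do_request_once(self, line):
        sock = self._get_sock()
        sock.write(line)
        sock.flush()
        reply = sock.readline()
        if not reply:
            # An empty read means the tracker closed the connection.
            raise BackendError("socket closed on read")
        return reply

    def do_request(self, line):
        try:
            return self._do_request_once(line)
        except (OSError, BackendError):
            # Drop the stale socket so _get_sock reconnects, then retry
            # exactly once.  A second failure propagates to the caller.
            self._sock = None
            return self._do_request_once(line)
```

Retrying only once keeps a tracker that is really down from turning
into an endless loop.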


Next, PRINT and _write have this fallback code:

    # at this point, we had a socket error, since we have bytes left, and
    # the loop above didn't finish sending them.  if this was our first
    # write, let's try to fall back to a different host.
    unless ($self->{bytes_out}) {
        if (my $dest = shift @{$self->{backup_dests}}) {
            # dest is [$devid,$path]
            $self->_parse_url($dest->[1]) or _fail("bogus URL");
            $self->{devid} = $dest->[0];
            $self->_connect_sock;

            # now repass this write to try again
            return $self->_write($data);
        }
    }

This has a few things wrong.

The check for bytes_out doesn't tell us whether this is the first write
or not.  I added a self._writecalls counter for that.  It needs to be
reset to 0 after the PUT header is sent, because the header is written
with self._write too, and that call shouldn't count.

It should be calling PRINT again, not _write.  _write doesn't send the
PUT header, so the code will fall back to another host but then get
disconnected when perlbal is confused by the missing header.

Also, the conditions under which a write can be retried include
sending a whole chunk at once.
My code has:
if self.backup_dests and (self.content_length == 0 or self._writecalls == 1):

I also had to comment out the call to _connect_sock, since PRINT already
calls it for us.  All you have to do is set $self->{sock} to undef at
the first error, a few lines above.
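
Putting the three fixes together, here is a cut-down sketch of the idea.
It is not the real client: the NewFile class, print_ (standing in for
PRINT) and the connect callable are made up for illustration, the PUT
header format is only indicative, and content_length == 0 is assumed to
mean that the whole body arrives in a single call.

```python
class NewFile:
    """Write handle that falls back to backup destinations (sketch only)."""

    def __init__(self, dests, connect, content_length=0):
        # dests: list of (devid, path) pairs, primary first.
        # connect: hypothetical callable(path) -> object with sendall().
        self.devid, self.path = dests[0]
        self.backup_dests = list(dests[1:])
        self.content_length = content_length
        self._connect = connect
        self.sock = None
        self._writecalls = 0

    def _write(self, data):
        self._writecalls += 1
        self.sock.sendall(data)

    def print_(self, data):
        try:
            if self.sock is None:
                # Connect and send the PUT header before any data.
                self.sock = self._connect(self.path)
                length = self.content_length or len(data)
                header = "PUT %s HTTP/1.0\r\nContent-length: %d\r\n\r\n" % (
                    self.path, length)
                self._write(header.encode("ascii"))
                # The header went through _write but is not a data write.
                self._writecalls = 0
            self._write(data)
        except OSError:
            # Forget the dead socket so the next print_ reconnects.
            self.sock = None
            retriable = self.content_length == 0 or self._writecalls == 1
            if not (self.backup_dests and retriable):
                raise
            self.devid, self.path = self.backup_dests.pop(0)
            # Go back through print_, not _write, so the header is resent.
            self.print_(data)
```

A failure on a later write is not retried, because the earlier data
already went to the first host and can't be replayed from here.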




-- 
-- Justin Azoff
-- Network Performance Analyst


