bugs in PRINT/_write and do_request
Justin Azoff
JAzoff at uamail.albany.edu
Sat May 7 14:41:46 PDT 2005
I've been working on getting reconnections during writes to work in the
Python client. Along the way I noticed that with the Perl client, if you run
----------------
use MogileFS;
my $mogfs = MogileFS->new(domain => 'test',
                          hosts  => [ 'peter:7001' ]);
die "Unable to initialize MogileFS object.\n" unless $mogfs;
my $file_contents = $mogfs->get_file_data("motd");
print "$$file_contents\n";
sleep(10);
$file_contents = $mogfs->get_file_data("motd");
print "$$file_contents\n";
--------------------
and restart the tracker during the sleep(10), the second call fails with
MogileFS::Backend: socket closed on read at /usr/share/perl5/MogileFS.pm line 129
This is because _get_sock never gets a chance to be called, so the stale
socket is reused. I accounted for this in the Python client by having it
retry do_request once, which forces _get_sock to be called again.
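A minimal sketch of that retry-once idea in Python. This is not the actual client code: the Backend class, its connect callable, _do_request_once and the FakeSock test double are all hypothetical, and only the names do_request and _get_sock come from the discussion above.

```python
class Backend:
    """Hypothetical stand-in for a tracker client (not the real client code)."""

    def __init__(self, connect):
        self._connect = connect   # assumed: callable returning a socket-like object
        self._sock = None

    def _get_sock(self):
        # Only open a new connection when no socket is cached.
        if self._sock is None:
            self._sock = self._connect()
        return self._sock

    def _do_request_once(self, line):
        sock = self._get_sock()
        sock.sendall(line)
        reply = sock.recv(4096)
        if not reply:
            # An empty read means the peer closed the connection.
            raise ConnectionError("socket closed on read")
        return reply

    def do_request(self, line):
        try:
            return self._do_request_once(line)
        except OSError:
            # Drop the stale socket so the single retry goes through
            # _get_sock and reconnects. A second failure propagates.
            self._sock = None
            return self._do_request_once(line)


class FakeSock:
    """Test double: hands out canned replies, then reports a closed socket."""

    def __init__(self, replies):
        self.replies = list(replies)

    def sendall(self, data):
        pass

    def recv(self, size):
        return self.replies.pop(0) if self.replies else b""


# The first socket answers once and then "dies", as if the tracker restarted.
socks = [FakeSock([b"OK first\r\n"]), FakeSock([b"OK second\r\n"])]
backend = Backend(lambda: socks.pop(0))
first = backend.do_request(b"hypothetical request\r\n")
second = backend.do_request(b"hypothetical request\r\n")
```

Retrying exactly once keeps a tracker that is really down from causing an endless loop, while still hiding a simple restart from the caller.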
Next, in PRINT and _write there is:
    # at this point, we had a socket error, since we have bytes left, and
    # the loop above didn't finish sending them.  if this was our first
    # write, let's try to fall back to a different host.
    unless ($self->{bytes_out}) {
        if (my $dest = shift @{$self->{backup_dests}}) {
            # dest is [$devid,$path]
            $self->_parse_url($dest->[1]) or _fail("bogus URL");
            $self->{devid} = $dest->[0];
            $self->_connect_sock;
            # now repass this write to try again
            return $self->_write($data);
        }
    }
This has a few things wrong:
The check for bytes_out doesn't tell us whether this is the first write.
I added a self._writecalls counter for that. It has to be reset to 0 after
the PUT header is sent, because writing the header goes through
self._write but shouldn't count as a write.
The retry should call PRINT, not _write. _write doesn't send the PUT
header, so the code falls back to another host but then gets disconnected
when perlbal is confused by the missing header.
Also, the conditions under which a write can be retried include
sending a whole chunk...
my code has:
if self.backup_dests and (self.content_length == 0 or self._writecalls == 1):
I also had to comment out the call to _connect_sock, since PRINT already
calls it for us. All you have to do is set $self->{sock} to undef at the
first error, a few lines above.
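Putting those points together, a rough Python sketch of the fallback logic might look like the following. The retry condition is the one quoted above. Everything else (the FileWriter class, its connect callable, the simplified PUT header and the FakeSock test double) is an assumption made for illustration, not the real client.

```python
class FileWriter:
    """Hypothetical sketch; only backup_dests, content_length and
    _writecalls are names taken from the discussion above."""

    def __init__(self, dests, content_length, connect):
        self.backup_dests = list(dests)          # entries are (devid, path)
        self.devid, self.path = self.backup_dests.pop(0)
        self.content_length = content_length
        self._connect = connect                  # assumed: path -> socket-like object
        self.sock = None
        self._writecalls = 0

    def _write(self, data):
        self._writecalls += 1
        self.sock.sendall(data)

    def _connect_sock(self):
        self.sock = self._connect(self.path)
        # Simplified header, for illustration only.
        self._write(("PUT %s HTTP/1.0\r\n\r\n" % self.path).encode())
        self._writecalls = 0                     # the header write doesn't count

    def write(self, data):                       # plays the role of PRINT
        try:
            if self.sock is None:
                self._connect_sock()             # reconnects and resends the header
            self._write(data)
        except OSError:
            self.sock = None                     # force a reconnect on the next write
            if self.backup_dests and (self.content_length == 0 or self._writecalls == 1):
                self.devid, self.path = self.backup_dests.pop(0)
                return self.write(data)          # retry via write, not _write
            raise


class FakeSock:
    """Test double: accepts `ok_sends` sendall calls, then raises."""

    def __init__(self, ok_sends):
        self.ok_sends = ok_sends
        self.sent = []

    def sendall(self, data):
        if len(self.sent) >= self.ok_sends:
            raise ConnectionResetError("peer closed")
        self.sent.append(data)


# The first host accepts the header and then drops the connection.
socks = {"/dev1/1.fid": FakeSock(ok_sends=1), "/dev2/1.fid": FakeSock(ok_sends=10)}
writer = FileWriter([(1, "/dev1/1.fid"), (2, "/dev2/1.fid")],
                    content_length=5, connect=socks.__getitem__)
writer.write(b"hello")
```

Because the retry goes back through write, the second host receives its own PUT header before the data, which is the point of calling PRINT rather than _write.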
--
-- Justin Azoff
-- Network Performance Analyst