[PATCH 3/3] Open files only once

Michael Hanselmann public at hansmi.ch
Tue Jul 15 21:51:12 UTC 2008


Hi Brad

Unless noted otherwise I implemented your changes. See below for an
interdiff.

On Mon, Jul 14, 2008 at 05:55:50PM -0700, Brad Fitzpatrick wrote:
> > +sub _read_file {
> > +    our ($self) = @_;

> Wrong use of our here.  You want my.

Ups, must've been a leftover from a previous version of this patch.

>         $self->{chunks} = [];
> > -        return ();
> > +        return;
> >     }

> This function is called in list context.  I like to stay explicit about that
> in my return type.  I don't want to rely on implicit casts.

_read_file() isn't called in list context ("$self->_read_file;" in the
constructor). Where do you think is it called in list context?

> > +        push @chunks, new Brackup::PositionedChunk(
> >             file => $self,
> >             offset => $offset,
> >             length => $len,
> > +            chunkref => \$data,
> > +            digest => $chunk_digest,
> >         );

> Uh, you do this unconditionally?  Not dependent on the size of the file!?
> How am I going to backup a huge file on a low-memory box?

Yes and no. The chunker is responsible to return sensible values for
$len (see the other two patches). However, I just realized that this can
indeed be problematic for large files because everything is kept in
memory until the file is done.

One solution would be to basically stream everything, but that's not
such a simple change.

Do you have an idea?

Regards,
Michael

---
@@ -28,11 +28,11 @@ sub new {
 
     die "No root object provided." unless $self->{root} && $self->{root}->isa("Brackup::Root");
     die "No path provided." unless $self->{path};
     $self->{path} =~ s!^\./!!;
 
-    $self->_read_file();
+    $self->_read_file;
 
     return $self;
 }
 
 sub root {
@@ -123,14 +123,14 @@ sub chunks {
     my $self = shift;
     return @{$self->{chunks}};
 }
 
 sub _read_file {
-    our ($self) = @_;
+    my ($self) = @_;
 
     # non-files don't have chunks
-    unless ($self->is_file) {
+    if (not $self->is_file) {
         $self->{chunks} = [];
         return;
     }
 
     my $fullpath = $self->fullpath;
@@ -143,12 +143,12 @@ sub _read_file {
 
     my $size = $self->size;
     my $last_offset = undef;
     my $last_len = undef;
 
-    my $sha1_file = new Digest::SHA1;
-    my $sha1_chunk = new Digest::SHA1;
+    my $sha1_file = Digest::SHA1->new;
+    my $sha1_chunk = Digest::SHA1->new;
 
     my @chunks;
 
     $chunker->get_offsets($self, sub {
         my ($offset, $len) = @_;
@@ -188,15 +188,15 @@ sub _read_file {
 
         # Update digest for complete file
         $sha1_file->add($data);
 
         # Generate chunk digest
-        $sha1_chunk->reset();
+        $sha1_chunk->reset;
         $sha1_chunk->add($data);
-        my $chunk_digest = 'sha1:' . $sha1_chunk->hexdigest();
+        my $chunk_digest = 'sha1:' . $sha1_chunk->hexdigest;
 
-        push @chunks, new Brackup::PositionedChunk(
+        push @chunks, Brackup::PositionedChunk->new(
             file => $self,
             offset => $offset,
             length => $len,
             chunkref => \$data,
             digest => $chunk_digest,


More information about the brackup mailing list