Patch adds ranged data fetches to MogileFS::Client

Brad Fitzpatrick brad at danga.com
Sun Nov 5 23:47:55 UTC 2006


I'd just throw Perlbal in front, using X-reproxy-url from the backend
process that does get_paths to the Mogile tracker.
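
A rough sketch of that backend step, in case it helps picture it. The helper name reproxy_response and the example storage URLs are made up for illustration; in a real backend the list would come from $mogfs->get_paths($key), and the Perlbal service would need reproxying enabled. It assumes a CGI-style backend and that Perlbal accepts a space-separated list of URLs in X-REPROXY-URL.

```perl
use strict;
use warnings;

# Hypothetical helper: build the header block a backend would return to
# Perlbal. The paths are passed in so the sketch runs without a tracker;
# in practice they would come from $mogfs->get_paths($key).
sub reproxy_response {
    # keep only defined HTTP paths, since Perlbal is asked to fetch a URL
    my @paths = grep { defined($_) && m!^http://! } @_;

    # assumption: a CGI-style backend that signals "not found" this way
    return "Status: 404 Not Found\r\n\r\n" unless @paths;

    # Perlbal tries the listed URLs and streams the body to the client
    return "X-REPROXY-URL: " . join(" ", @paths) . "\r\n\r\n";
}

# Example storage URLs are invented for the sketch.
print reproxy_response(
    "http://10.0.0.1:7500/dev1/0/000/000/0000000001.fid",
    "http://10.0.0.2:7500/dev2/0/000/000/0000000001.fid",
);
```

The backend never touches the file body here; it only hands Perlbal the locations, which is why this suits large video files.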


On Fri, 27 Oct 2006, Cahill, Earl wrote:

> Anyone stream say, video in front of mogile?  Or how would you?
>
> Thanks,
> Earl
>
> -----Original Message-----
> From: mogilefs-bounces at lists.danga.com
> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Arthur Bebak
> Sent: Thursday, October 26, 2006 5:07 PM
> To: mogilefs
> Subject: Patch adds ranged data fetches to MogileFS::Client
>
> All,
>
> I've created a new method for MogileFS::Client which allows you
> to fetch only portions of the file.
>
> This is mighty useful if you're dealing with large files and don't
> want to slurp the entire thing into memory all at once - something
> I'm sure many Mogile users are concerned about.
>
> The method is called get_file_data_range.
>
> # Here's how to call it, pick one method.
> # The range arg always overrides length and offset.
> # If not provided, offset is assumed to be 0.
> # All numbers are in units of bytes
> #
> %arg_hash = ( "range" => "1000-1100" );                # bytes 1000-1100 inclusive
> %arg_hash = ( "length" => "100", "offset" => "500" );  # bytes 500-599
> %arg_hash = ( "length" => "300" );                     # bytes 0-299
>
> $content_ref = $mogfs->get_file_data_range( $key, %arg_hash );
>
> print "content = '$$content_ref'";
>
> I'm basically constructing a Range: HTTP header to send to
> the Mogile storage daemon. You can read about what you can
> put in the "range" key here:
>
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35
>
>
> The patch below is to MogileFS-Client-1.03/lib/MogileFS/Client.pm
>
> You can apply it like so:
>
> Copy everything between the start/end lines to a patchfile,
> then run these commands:
>
> cd MogileFS-Client-1.03/lib/MogileFS
> patch Client.pm < patchfile
> cd ../..
> perl Makefile.PL
> make
> make install
>
>
> ------------------ start patch ------------------
> 264a265,360
>  > #
>  > # given a key, returns a scalar reference pointing at a string containing
>  > # the contents of the file. takes two parameters; a scalar key to get the
>  > # data for the file, and a hash which can have one of several keys:
>  > # %arg_hash = ( "timeout" => 10, "range" => "0-1024", "length" => "1000", "offset" => "100");
>  > #
>  > # See the definition of the HTTP Range: header for details of what
>  > # the "range" key can look like, but in general assuming a file of size 10000
>  > # you can do values like this:
>  > #
>  > # The first 500 bytes (byte offsets 0-499, inclusive):     "range" => "0-499"
>  > # The second 500 bytes (byte offsets 500-999, inclusive):  "range" => "500-999"
>  > # The final 500 bytes (byte offsets 9500-9999, inclusive): "range" => "-500"
>  > #                                                          "range" => "9500-"
>  > # The first and last bytes only (bytes 0 and 9999):        "range" => "0-0,-1"
>  > #
>  > # The other way to get a range is to give an offset into the file, and
>  > # specify the length. So for example, given "length" => 1000, "offset" => 100,
>  > # you'd get the equivalent of "range" => "100-1099". Note that the
>  > # offset byte is included, so in general the formula is:
>  > # $range = $offset . "-" . ($offset + $length - 1);
>  > #
>  > # If offset is not given, then it is assumed that "offset" => 0. This makes
>  > # it easy to get the first $n bytes of the file:
>  > # $n = 100;
>  > # %arg_hash = ( "length" => $n );
>  > # $content_ref = $mogfs->get_file_data_range( $key, %arg_hash );
>  > #
>  > # If the range key is defined, length/offset are ignored.
>  > #
>  > sub get_file_data_range {
>  >     # given a key, load some paths and get data
>  >     my MogileFS::Client $self = shift;
>  >     my ($key, %arg_hash) = @_;
>  >
>  >     # Let's parse all the optional args
>  >     my $timeout;
>  >     if( exists $arg_hash{'timeout'} ) {
>  >         $timeout = $arg_hash{'timeout'};
>  >         } # if
>  >
>  >     my $range;
>  >     if( exists $arg_hash{'range'} ) {
>  >         $range = $arg_hash{'range'};
>  >         } # if
>  >
>  >     my $offset;
>  >     if( exists $arg_hash{'offset'} ) { $offset = $arg_hash{'offset'} } # if
>  >     else { $offset = "0"; }
>  >
>  >     my $length;
>  >     if( exists $arg_hash{'length'} && ! exists $arg_hash{'range'} ) {
>  >        my $num_bytes = $arg_hash{'length'} + $offset - 1;
>  >        $range = $offset . "-" . $num_bytes;
>  >        } # if
>  >
>  >     my @paths = $self->get_paths($key, 1);
>  >     return undef unless @paths;
>  >
>  >     # iterate over each
>  >     foreach my $path (@paths) {
>  >         next unless defined $path;
>  >         if ($path =~ m!^http://!) {
>  >             # try via HTTP
>  >             my $ua = new LWP::UserAgent;
>  >             $ua->timeout($timeout || 10);
>  >
>  >             my $res;
>  >             if(defined $range) {
>  >                #
>  >                # This will create an HTTP request header which looks like this:
>  >                # Range: bytes=$range
>  >                #
>  >                $res = $ua->get($path, "Range" => "bytes=$range" );
>  >                 } # if
>  >             else {
>  >                $res = $ua->get($path);
>  >                 } # else
>  >
>  >             if ($res->is_success) {
>  >                 my $contents = $res->content;
>  >                 return \$contents;
>  >             }
>  >
>  >         } else {
>  >             # open the file from disk and just grab it all (the range is ignored here)
>  >             open FILE, "<$path" or next;
>  >             my $contents;
>  >             { local $/ = undef; $contents = <FILE>; }
>  >             close FILE;
>  >             return \$contents if $contents;
>  >         }
>  >     }
>  >     return undef;
>  > }
>  >
> ------------------ end patch --------------------
>
> --
> Arthur Bebak
> abebak at fabrikinc.com
>
>
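
For reference, the argument handling described in the quoted patch can be tried on its own. This is a minimal sketch, not part of the patch: the helper name range_header_value is hypothetical, and it checks arguments with defined where the patch uses exists. It returns the string that would go in the Range: header, or undef when neither a range nor a length is given.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the patch's argument handling: returns the
# value for a "Range:" header, or undef when no range was requested.
sub range_header_value {
    my (%args) = @_;

    # an explicit range always overrides length/offset
    return "bytes=$args{range}" if defined $args{range};

    # without a length there is nothing to restrict
    return undef unless defined $args{length};

    my $offset = defined $args{offset} ? $args{offset} : 0;
    my $last   = $offset + $args{length} - 1;    # inclusive end offset
    return "bytes=$offset-$last";
}

print range_header_value( range  => "1000-1100" ), "\n";          # bytes=1000-1100
print range_header_value( length => 100, offset => 500 ), "\n";   # bytes=500-599
print range_header_value( length => 300 ), "\n";                  # bytes=0-299
```

Because HTTP byte ranges are inclusive at both ends, the end offset is offset + length - 1, which is what the three example calls in the quoted message rely on.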

