Patch adds ranged data fetches to MogileFS::Client

Cahill, Earl ecahill at corp.untd.com
Fri Oct 27 07:21:13 UTC 2006


Does anyone stream, say, video in front of Mogile?  If not, how would you do it?

Thanks,
Earl

-----Original Message-----
From: mogilefs-bounces at lists.danga.com
[mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Arthur Bebak
Sent: Thursday, October 26, 2006 5:07 PM
To: mogilefs
Subject: Patch adds ranged data fetches to MogileFS::Client

All,

I've created a new method for MogileFS::Client which allows you
to fetch only portions of the file.

This is mighty useful if you're dealing with large files and don't
want to slurp the entire thing into memory all at once - something
I'm sure many Mogile users are concerned about.

The method is called get_file_data_range.

# Here's how to call it; pick one of the three forms.
# The range arg always overrides length and offset.
# If not provided, offset is assumed to be 0.
# All numbers are in units of bytes.
#
%arg_hash = ( "range" => "1000-1100" );                      # bytes 1000-1100 inclusive
%arg_hash = ( "length" => "100", "offset" => "500" );        # bytes 500-599
%arg_hash = ( "length" => "300" );                           # bytes 0-299

$content_ref = $mogfs->get_file_data_range( $key, %arg_hash );

print "content = '$$content_ref'";
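
To make the precedence rules concrete, here is a rough standalone sketch of how the arguments collapse into a single range string. The name build_range is made up for illustration and is not part of the patch; it also checks "defined" where the patch checks "exists", which is an assumption on my part.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): turn the argument hash
# into the string that goes after "bytes=" in the Range header.
sub build_range {
    my (%args) = @_;

    # An explicit range always wins over length/offset.
    return $args{range} if defined $args{range};

    # Without a length there is nothing to restrict: fetch the whole file.
    return undef unless defined $args{length};

    my $offset = defined $args{offset} ? $args{offset} : 0;
    my $last   = $offset + $args{length} - 1;    # end byte is inclusive
    return "$offset-$last";
}

print build_range( range  => "1000-1100" ), "\n";            # 1000-1100
print build_range( length => 100, offset => 500 ), "\n";     # 500-599
print build_range( length => 300 ), "\n";                    # 0-299
```

An undef result here corresponds to the case where no Range header is sent at all and the whole file comes back.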

I'm basically constructing a Range: HTTP header to send to
the Mogile storage daemon. You can read about what you can
put in the "range" key here:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35
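
If the spec is heavy going, the sketch below shows how I read the single-range forms: given a file size, it works out which bytes a range string covers. resolve_range is a hypothetical name, it is not in the patch, and it deliberately skips comma-separated sets such as "0-0,-1".

```perl
use strict;
use warnings;

# Hypothetical helper: resolve one byte-range-spec against a known file
# size and return the inclusive (first, last) byte offsets, or an empty
# list if the spec cannot be satisfied. Single specs only.
sub resolve_range {
    my ($spec, $size) = @_;
    return unless $size > 0;

    if ($spec =~ /^(\d+)-(\d*)$/) {            # "500-999" or "9500-"
        my ($first, $last) = ($1, $2);
        return if $first >= $size;             # starts past end of file
        $last = $size - 1 if $last eq '' || $last >= $size;
        return if $last < $first;              # malformed, e.g. "10-5"
        return ($first, $last);
    }
    if ($spec =~ /^-(\d+)$/) {                 # "-500": the final 500 bytes
        my $n = $1;
        return if $n == 0;
        $n = $size if $n > $size;              # suffix longer than the file
        return ($size - $n, $size - 1);
    }
    return;                                    # not a form handled here
}

for my $spec ("0-499", "500-999", "-500", "9500-") {
    my ($first, $last) = resolve_range($spec, 10000);
    print "bytes=$spec covers $first-$last\n";
}
```

For a 10000 byte file, both "-500" and "9500-" come out as bytes 9500-9999.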


The patch below is to MogileFS-Client-1.03/lib/MogileFS/Client.pm

You can apply it like so:

Copy everything between the start/end lines to a patchfile,
then run these commands:

cd MogileFS-Client-1.03/lib/MogileFS
patch Client.pm < patchfile
cd ../..
perl Makefile.PL
make
make install


------------------ start patch ------------------
264a265,360
 > #
 > # given a key, returns a scalar reference pointing at a string containing
 > # the contents of the file. takes two parameters; a scalar key to get the
 > # data for the file, and a hash which can have one of several keys:
 > # %arg_hash = ( "timeout" => 10, "range" => "0-1024", "length" => "1000", "offset" => "100");
 > #
 > # See the definition of the HTTP Range: header for details of what
 > # the "range" key can look like, but in general assuming a file of size 10000
 > # you can do values like this:
 > #
 > # The first 500 bytes (byte offsets 0-499, inclusive):     "range" => "0-499"
 > # The second 500 bytes (byte offsets 500-999, inclusive):  "range" => "500-999"
 > # The final 500 bytes (byte offsets 9500-9999, inclusive): "range" => "-500"
 > #                                                          "range" => "9500-"
 > # The first and last bytes only (bytes 0 and 9999):        "range" => "0-0,-1"
 > #
 > # The other way to get a range is to give an offset into the file, and
 > # specify the length. So for example, given "length" => 1000, "offset" => 100,
 > # you'd get the equivalent of "range" => "100-1099". Note that the
 > # offset byte is included, so in general the formula is:
 > # $range = $offset . "-" . ($offset + $length - 1);
 > #
 > # If offset is not given, then it is assumed that "offset" => 0. This makes
 > # it easy to get the first $n bytes of the file:
 > # $n = 100;
 > # %arg_hash = ( "length" => $n );
 > # $content_ref = $mogfs->get_file_data_range( $key, %arg_hash );
 > #
 > # If the range key is defined, length/offset are ignored.
 > #
 > sub get_file_data_range {
 >     # given a key, load some paths and get data
 >     my MogileFS::Client $self = shift;
 >     my ($key, %arg_hash) = @_;
 >
 >     # Let's parse all the optional args
 >     my $timeout;
 >     if( exists $arg_hash{'timeout'} ) {
 >         $timeout = $arg_hash{'timeout'};
 >         } # if
 >
 >     my $range;
 >     if( exists $arg_hash{'range'} ) {
 >         $range = $arg_hash{'range'};
 >         } # if
 >
 >     my $offset;
 >     if( exists $arg_hash{'offset'} ) { $offset = $arg_hash{'offset'} } # if
 >     else { $offset = "0"; }
 >
 >     my $length;
 >     if( exists $arg_hash{'length'} && ! exists $arg_hash{'range'} ) {
 >        my $num_bytes = $arg_hash{'length'} + $offset - 1;
 >        $range = $offset . "-" . $num_bytes;
 >        } # if
 >
 >     my @paths = $self->get_paths($key, 1);
 >     return undef unless @paths;
 >
 >     # iterate over each
 >     foreach my $path (@paths) {
 >         next unless defined $path;
 >         if ($path =~ m!^http://!) {
 >             # try via HTTP
 >             my $ua = new LWP::UserAgent;
 >             $ua->timeout($timeout || 10);
 >
 >             my $res;
 >             if(defined $range) {
 >                #
 >                # This will create an HTTP request header which looks like this:
 >                # Range: bytes=$range
 >                #
 >                $res = $ua->get($path, "Range" => "bytes=$range" );
 >                 } # if
 >             else {
 >                $res = $ua->get($path);
 >                 } # else
 >
 >             if ($res->is_success) {
 >                 my $contents = $res->content;
 >                 return \$contents;
 >             }
 >
 >         } else {
 >             # open the file from disk and just grab it all (the range is not applied here)
 >             open FILE, "<$path" or next;
 >             my $contents;
 >             { local $/ = undef; $contents = <FILE>; }
 >             close FILE;
 >             return \$contents if $contents;
 >         }
 >     }
 >     return undef;
 > }
 >
------------------ end patch --------------------
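
Note that the non-HTTP branch in the patch still reads the whole file from disk. As a rough sketch only, a ranged read from a local path could be done with seek and read along these lines. read_file_range is a hypothetical helper, not something the patch or MogileFS::Client provides, and it takes an offset and length rather than a range string.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of the patch: read $length bytes starting
# at $offset from a local file, without loading the rest of it.
sub read_file_range {
    my ($path, $offset, $length) = @_;

    open my $fh, '<', $path or return undef;
    binmode $fh;                                   # treat the file as raw bytes
    seek($fh, $offset, 0) or return undef;         # 0 means "from start of file"

    my $contents = '';
    my $got = read($fh, $contents, $length);       # may be short near end of file
    close $fh;

    return undef unless defined $got;              # undef signals a read error
    return \$contents;
}

# Small demonstration against a temporary file holding "a".."z".
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} join('', 'a' .. 'z');
close $tmp_fh;

my $ref = read_file_range($tmp_path, 5, 3);
print "content = '$$ref'\n";                       # content = 'fgh'
```

Like the HTTP branch, this hands back a scalar reference, so a caller would not need to care which path served the data.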

-- 
Arthur Bebak
abebak at fabrikinc.com

