Patch adds ranged data fetches to MogileFS::Client
Brad Fitzpatrick
brad at danga.com
Sun Nov 5 23:47:55 UTC 2006
I'd just throw Perlbal in front, using X-reproxy-url from the backend
process that does get_paths to the Mogile tracker.
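
A minimal sketch of that setup, assuming a CGI-style backend sitting behind
Perlbal with reproxying enabled on the service. The helper name
reproxy_response, the 404 fallback, and the tracker host and domain in the
usage comment are all made up for illustration; only get_paths and the
X-REPROXY-URL header come from the thread. The backend never touches the file
body, it only hands Perlbal a space-separated list of storage URLs:

```perl
use strict;
use warnings;

# Hypothetical helper: turn the paths returned by get_paths into the
# header block a backend would send back to Perlbal. Perlbal then fetches
# the body from one of the listed URLs and streams it to the client.
sub reproxy_response {
    my (@paths) = @_;

    # Only HTTP paths can be reproxied by URL; skip undef and disk paths.
    my @urls = grep { defined($_) && m!^http://! } @paths;

    # Assumption: a CGI-style "Status:" line is how this backend reports errors.
    return "Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\nnot found\n"
        unless @urls;

    return "X-REPROXY-URL: " . join(' ', @urls) . "\r\n\r\n";
}

# Usage, assuming a tracker at a made-up address and a made-up domain:
#   use MogileFS::Client;
#   my $mogfs = MogileFS::Client->new(domain => 'video',
#                                     hosts  => ['10.0.0.1:7001']);
#   print reproxy_response($mogfs->get_paths($key));
```

Since the browser talks to Perlbal the whole time, the backend process is
freed up as soon as it has printed the header.
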
On Fri, 27 Oct 2006, Cahill, Earl wrote:
> Anyone stream say, video in front of mogile? Or how would you?
>
> Thanks,
> Earl
>
> -----Original Message-----
> From: mogilefs-bounces at lists.danga.com
> [mailto:mogilefs-bounces at lists.danga.com] On Behalf Of Arthur Bebak
> Sent: Thursday, October 26, 2006 5:07 PM
> To: mogilefs
> Subject: Patch adds ranged data fetches to MogileFS::Client
>
> All,
>
> I've created a new method for MogileFS::Client which allows you
> to fetch only portions of the file.
>
> This is mighty useful if you're dealing with large files and don't
> want to slurp the entire thing into memory all at once - something
> I'm sure many Mogile users are concerned about.
>
> The method is called get_file_data_range.
>
> # Here's how to call it, pick one method.
> # The range arg always overrides length and offset.
> # If not provided, offset is assumed to be 0.
> # All numbers are in units of bytes
> #
> %arg_hash = ( "range" => "1000-1100" );                # bytes 1000-1100 inclusive
> %arg_hash = ( "length" => "100", "offset" => "500" );  # bytes 500-599
> %arg_hash = ( "length" => "300" );                     # bytes 0-299
>
> $content_ref = $mogfs->get_file_data_range( $key, %arg_hash );
>
> print "content = '$$content_ref'";
>
> I'm basically constructing a Range: HTTP header to send to
> the Mogile storage daemon. You can read about what you can
> put in the "range" key here:
>
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35
>
>
> The patch below is to MogileFS-Client-1.03/lib/MogileFS/Client.pm
>
> You can apply it like so:
>
> Copy everything between the start/end lines to a patchfile,
> then run these commands:
>
> cd MogileFS-Client-1.03/lib/MogileFS
> patch Client.pm < patchfile
> cd ../..
> perl Makefile.PL
> make
> make install
>
>
> ------------------ start patch ------------------
> 264a265,360
> > #
> > # given a key, returns a scalar reference pointing at a string containing
> > # the contents of the file. takes two parameters; a scalar key to get the
> > # data for the file, and a hash which can have one of several keys:
> > # %arg_hash = ( "timeout" => 10, "range" => "0-1024", "length" => "1000", "offset" => "100");
> > #
> > # See the definition of the HTTP Range: header for details of what
> > # the "range" key can look like, but in general assuming a file of size 10000
> > # you can do values like this:
> > #
> > # The first 500 bytes (byte offsets 0-499, inclusive): "range" => "0-499"
> > # The second 500 bytes (byte offsets 500-999, inclusive): "range" => "500-999"
> > # The final 500 bytes (byte offsets 9500-9999, inclusive): "range" => "-500"
> > # "range" => "9500-"
> > # The first and last bytes only (bytes 0 and 9999): "range" => "0-0,-1"
> > #
> > # The other way to get a range is to give an offset into the file, and
> > # specify the length. So for example, given "length" => 1000, "offset" => 100,
> > # you'd get the equivalent of "range" => "100-1099". Note that the
> > # offset byte is included, so in general the formula is:
> > # $range = $offset . "-" . ($offset + $length - 1);
> > #
> > # If offset is not given, then it is assumed that "offset" => 0. This makes
> > # it easy to get the first $n bytes of the file:
> > # $n = 100;
> > # %arg_hash = ( "length" => $n );
> > # $content_ref = $mogfs->get_file_data_range( $key, %arg_hash );
> > #
> > # If the range key is defined, length/offset are ignored.
> > #
> > sub get_file_data_range {
> > # given a key, load some paths and get data
> > my MogileFS::Client $self = shift;
> > my ($key, %arg_hash) = @_;
> >
> > # Let's parse all the optional args
> > my $timeout;
> > if( exists $arg_hash{'timeout'} ) {
> > $timeout = $arg_hash{'timeout'};
> > } # if
> >
> > my $range;
> > if( exists $arg_hash{'range'} ) {
> > $range = $arg_hash{'range'};
> > } # if
> >
> > my $offset;
> > if( exists $arg_hash{'offset'} ) { $offset = $arg_hash{'offset'} } # if
> > else { $offset = "0"; }
> >
> > my $length;
> > if( exists $arg_hash{'length'} && ! exists $arg_hash{'range'} ) {
> > my $num_bytes = $arg_hash{'length'} + $offset - 1;
> > $range = $offset . "-" . $num_bytes;
> > } # if
> >
> > my @paths = $self->get_paths($key, 1);
> > return undef unless @paths;
> >
> > # iterate over each
> > foreach my $path (@paths) {
> > next unless defined $path;
> > if ($path =~ m!^http://!) {
> > # try via HTTP
> > my $ua = new LWP::UserAgent;
> > $ua->timeout($timeout || 10);
> >
> > my $res;
> > if(defined $range) {
> > #
> > # This will create an HTTP request header which looks like this:
> > # Range: bytes=$range
> > #
> > $res = $ua->get($path, "Range" => "bytes=$range" );
> > } # if
> > else {
> > $res = $ua->get($path);
> > } # else
> >
> > if ($res->is_success) {
> > my $contents = $res->content;
> > return \$contents;
> > }
> >
> > } else {
> > # open the file from disk and just grab it all
> > open FILE, "<$path" or next;
> > my $contents;
> > { local $/ = undef; $contents = <FILE>; }
> > close FILE;
> > return \$contents if $contents;
> > }
> > }
> > return undef;
> > }
> >
> ------------------ end patch --------------------
>
> --
> Arthur Bebak
> abebak at fabrikinc.com
>
>
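
The offset/length arithmetic described in the patch comments can be sketched
on its own, without a tracker. This is only an illustration, not part of the
patch: range_for and chunk_ranges are made-up helper names, and the file size
is assumed to be known from somewhere else (the patch itself does not report
it). range_for mirrors what get_file_data_range does with "offset" and
"length"; chunk_ranges walks a file in fixed-size pieces so that no single
fetch has to hold the whole thing in memory.

```perl
use strict;
use warnings;

# Convert an offset and a length into the inclusive "first-last" string
# that goes after "bytes=" in a Range header. Offset defaults to 0.
sub range_for {
    my ($offset, $length) = @_;
    $offset = 0 unless defined $offset;
    die "length must be a positive number of bytes\n"
        unless defined($length) && $length > 0;
    return $offset . "-" . ($offset + $length - 1);
}

# Split a file of $size bytes into consecutive ranges of at most
# $chunk bytes each; the last range is shorter if $size is not a
# multiple of $chunk.
sub chunk_ranges {
    my ($size, $chunk) = @_;
    die "chunk size must be positive\n" unless $chunk > 0;
    my @ranges;
    for (my $off = 0; $off < $size; $off += $chunk) {
        my $left = $size - $off;
        push @ranges, range_for($off, $left < $chunk ? $left : $chunk);
    }
    return @ranges;
}

# A 2500-byte file read 1000 bytes at a time.
print "$_\n" for chunk_ranges(2500, 1000);
# prints:
#   0-999
#   1000-1999
#   2000-2499
```

With the patched client, each of those strings could be passed as
"range" => $r to get_file_data_range, one call per chunk.
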