mogilefs usage and performance

Brad Fitzpatrick brad at danga.com
Sun Aug 20 21:20:13 UTC 2006


Possibly of interest:

Perlbal, as of Friday, can now cache reproxy URLs internally (not
using memcached, but optional memcached support will be next), and your
application can tell it how long each cache entry should persist.
Unfortunately, there's no way to invalidate the cache, so it's only useful
for immutable URLs (which may not fit your case, where users can replace
the file behind a filename).

If you could, it'd be interesting for you to rerun your numbers with that
new feature and let us know how it goes.

You'll have to return a new header from mod_perl like:

X-Reproxy-Cache-For: 3600; Content-Type Last-Modified

Which says "cache for 3600 seconds, and cache the following headers as
well". (Content-Length will be filled in by mogstored.)
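
A minimal sketch of building that header value, in case it helps. The
helper name reproxy_cache_for is made up for illustration and is not part
of Perlbal or mod_perl, and the header_out calls in the trailing comment
assume a mod_perl 1.x handler like the one quoted further down:

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perlbal or mod_perl API): formats the value of
# the X-Reproxy-Cache-For header as "<seconds>; <header names>".
sub reproxy_cache_for {
    my ($seconds, @headers) = @_;
    die "seconds must be a non-negative integer\n"
        unless defined $seconds && $seconds =~ /^\d+$/;
    # Header names to cache are separated by single spaces.
    return $seconds . '; ' . join(' ', @headers);
}

my $value = reproxy_cache_for(3600, 'Content-Type', 'Last-Modified');
print "X-Reproxy-Cache-For: $value\n";

# Assumed usage inside a mod_perl 1.x handler, next to the reproxy URL:
#   $r->header_out('X-Reproxy-Url'       => $paths[0]);
#   $r->header_out('X-Reproxy-Cache-For' => $value);
```

Since the cache can't be invalidated, the number of seconds passed in is
the longest a replaced file could keep being served from its old path.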

Also, have you tried using lighttpd instead of mogstored?  At least for
GETs.  With the svn Mogile you can even use lighttpd for the PUTs/DELETEs.

- Brad


On Mon, 14 Aug 2006, Anatoli Stoyanovski wrote:

> > > If configured with 'memcached before get_paths':
> > >
> > > perlbal->mod_perl->memcached->perlbal->mogstored: 250req/sec
> > >
> > > mod_perl->memcached (no real file contents, just get x-reproxy): 1500
> > > req/sec (memcached is fast, which is no surprise)
> > >
> > > I don't want to store actual file contents in memcached, so I didn't test it.
> > > So, (1500 req/sec of mod_perl/memcached + 1170 req/sec of mogstored) via
> > > perlbal = 250 req/sec overall for downloading a 10 KB file over
> > > 10 concurrent connections.
> >
> > Paths don't usually change that often, so I just give them a five minute
> > timeout. Very occasionally we'd see a blip, but it would correct itself.
> > Also if you use perlbal's ability to X-Reproxy to a list of paths it'll
> > almost always work even if one of the paths dies.
>
> The problem is images that users replace via the web interface. If a user
> replaces an image, we don't change its URL (and it shouldn't change).
> mogilefs will give a new .fid URL for each such image, but the URL
> /images/somename.jpg will return a 404 error for 5 minutes if we delete
> the old image from mogilefs, and all site users will see it. We could leave
> the old image there and not worry about image garbage, but that will
> confuse the publisher for up to 5 minutes, which isn't ideal either. So
> the cache timeout should be very small.
>
> > So what you're saying is that:
> >
> > Just directly from mod_perl->memcached to fetch a path but not actually
> > reproxying runs at 1500 requests/sec?
>
> Yes
>
> >
> > How're you getting the requests/sec to mogstored? Directly querying it?
>
> Yes. mod_perl on port 81, just doing
> $r->header_out('X-Reproxy-Url'=>$paths[0]);
>
> Under test conditions it always returns a constant URL, which I use for
> the next 'ab' benchmark call.
>
> > Then putting the whole chain together ends up with 250req/sec? Are you
> > monitoring the box during your 'ab' tests and seeing if any one
> > component starts nailing IO, page thrashing, or maxing out CPU?
>
> Well... after 50 000 requests the system state (by top):
> Cpu(s): 75.7% us,  8.9% sy,  0.0% ni, 11.9% id
> load average: 2.05, 1.01, 0.44
>
> perlbal 93% CPU
> mogilefsd 17% (if memcached switched off)
> apache/mod_perl: 5 processes at 7% each
> nginx (replacing mogstored in GETs): <5%
>
> mysql runs on another server.
>
> There is no disk activity, since all logging is switched off and the file is in the RAM cache.
>
> > Also, is your perlbal -> mod_perl step using backend keepalives? Not
> > having that working adds a lot of overhead to perlbal.
>
>   SET persist_client  = on
>   SET persist_backend = on
>   SET verify_backend  = off
>   SET enable_reproxy  = true
>
> I switched verify_backend off for test speed.
>
> Perlbal was tuned as Mark Smith suggests, but it still eats CPU.
> Maybe I should look for another web accelerator with a
> reproxy-like feature.
>
>

