mogilefs usage and performance

dormando dormando at
Fri Aug 11 07:13:51 UTC 2006

Just for clarification... all of your tests have the Mogile tracker in 
the request path. Can you test with a local (in-mod_perl) or external 
(memcached) path cache? Or have you already?

In our setup the typical flow is:

perlbal->php->memcached, then back up to perlbal->mogstored.

If the path is not in memcached (or has expired), it falls back to what 
you tested: perlbal->php->tracker (storing the result to memcached), 
then up to perlbal->mogstored for the rest.

I'd suggest a test where you cache paths locally in mod_perl, then try 
again with memcached if possible.

So it'd be perlbal->mod_perl->perlbal->mogstored, which is still going 
to be slower than a direct fetch, but should be much faster than 
hitting the tracker on every request.
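For what it's worth, the memcached path cache can be pretty small. A 
rough sketch of the get_paths caching (module names are the usual CPAN 
clients; the server addresses, domain, key prefix, and TTL are 
placeholders, not from your setup):

```perl
use strict;
use Cache::Memcached;
use MogileFS::Client;

# Placeholder addresses/domain - substitute your own.
my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });
my $mogc = MogileFS::Client->new(
    domain => 'example',
    hosts  => ['tracker1:6001'],
);

sub cached_get_paths {
    my ($key) = @_;
    my $paths = $memd->get("mogpath:$key");
    return @$paths if $paths;                 # cache hit: tracker skipped
    my @paths = $mogc->get_paths($key);       # cache miss: ask the tracker
    $memd->set("mogpath:$key", \@paths, 300)  # 5 min TTL (tune to taste)
        if @paths;
    return @paths;
}

# In mod_perl you'd then hand the result back to perlbal, e.g.:
# $r->headers_out->set('X-REPROXY-URL', join ' ', cached_get_paths($key));
```

A short TTL keeps you from serving stale paths for too long after a 
file is moved or re-replicated; you can also just delete the key when 
you update a file.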

Also, what concurrency are you testing all of this with? Given all of 
the network delays in serving up mogilefs files, I've gotten more 
throughput by widening the number of requests going in at once.
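E.g. with ab, holding the total request count constant and raising -c; 
compare the requests/sec line between runs (the URL here is a 
placeholder for your perlbal front-end):

```shell
# Same workload, increasing concurrency.
ab -n 1000 -c 10  http://frontend/img/test.jpg
ab -n 1000 -c 50  http://frontend/img/test.jpg
ab -n 1000 -c 100 http://frontend/img/test.jpg
```

If r/s keeps climbing with -c, your 190 r/s at -c 10 is mostly latency 
per request, not a throughput ceiling.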

Finally, I don't know if this can be replicated elsewhere, but I've 
found perlbal to be really hard to benchmark. I did not have enough time 
to play around with it, but full-proxy benchmarks against dynamic apache 
pages would sometimes lock up entirely or drop to almost negligible 
rates. However, in some other tests with perlbal load balancing two 
static apache servers, I was getting thousands of requests per second. 
Something to keep an eye out for, I guess...


Anatoli Stoyanovski wrote:
> Dear readers,
> I'm testing mogilefs for a website which serves many pages and has a
> complex technique for generating and caching content.
> I plan to store generated html files (actually, something like
> SSI templates) on mogilefs, replicated over a number of servers. I
> set up a single brand-new PIV 2.8GHz test server and ran some benchmarks.
> The mogilefs setup is perlbal on the front, apache w/ mod_perl,
> mogstored/mogilefsd/mysql (http mode). SuSE Linux, default apache,
> default mysql, default distro setup.
> Perlbal (port 80) proxies to apache/mod_perl (port 81), which contacts
> the tracker (port 6001), then reproxies to mogstored (port 7500),
> etc...
> I used 'ab', the apache benchmark utility, with 1000 requests over 10
> concurrent connections for a single 10-kb image file in several
> configurations:
> 1) Apache get: 1500 requests/sec
> 2) Perlbal->mod_perl->Tracker->Perlbal->mogstored: 190 r/s
> mogstored (direct download of known file url): 1170 r/s
> 3) Perlbal->mod_perl->Tracker: 140 r/s (don't use x-reproxy-url,
> return file contents via mod_perl)
> 4) mod_perl->Tracker->mogstored: 180 r/s (we probably don't need
> perlbal if mod_perl read file)
> 5) mod_perl->Tracker: 530 r/s (we don't get file content, just testing
> the tracker get_paths api)
> 6) local disk reads using perl: about 50 000 reads/seconds (disk cache, 
> etc)
> As we see, perlbal in front is the best of the mogilefs
> configurations, but very slow compared with direct local disk access.
> That's bad.
> Mogstored (based on perlbal) is fast enough by itself. It's good.
> 5th special test just for information, as a part of mogilefs process.
> Overall, the base configuration (2) has poor performance for us. Our
> index page contains about 50 template files (let's not discuss how to
> optimize that now), so I need 50 reads from mogilefs to assemble the
> page. That takes about 0.4 seconds, which is bad. Under heavy load it
> will be even slower.
> So, I want to ask - what's the optimal environment for these tasks,
> where I need to read many small files very quickly and want to use a
> replicating file system like mogilefs? As I understand it, I can scale
> trackers, mysql, and set up more storage servers, but the minimum time
> for 50 files is still 0.4 seconds.
> I've checked /var/mogdata and found that it contains regular files.
> That gives me a rough idea for another way of using mogilefs: every
> server which holds mogilefs replicas will try to read each file
> locally from /var/mogdata, and if it's not found, fall back to reading
> it via get_file_data (this would need tuning for a real project). Does
> anybody use this method? I need as many as 500 10-kbyte reads per
> second.
> Finally, I've tested 'gfarm' filesystem in the same environment. It
> gets 40 reads/second. Much less.
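(Regarding the local-read idea above: a very rough, untested sketch, 
assuming you can work out a key's on-disk location yourself - note the 
/var/mogdata fid layout is internal to mogstored and not a supported 
API, so this can break across versions:)

```perl
use strict;
use MogileFS::Client;

# Placeholder domain/hosts - substitute your own.
my $mogc = MogileFS::Client->new(
    domain => 'example',
    hosts  => ['tracker1:6001'],
);

# $local_path: where you expect this key's fid to live on this machine,
# derived however your project maps keys to /var/mogdata paths.
sub read_file {
    my ($key, $local_path) = @_;
    if (defined $local_path && open my $fh, '<', $local_path) {
        local $/;               # slurp mode
        my $data = <$fh>;
        close $fh;
        return $data;           # local replica found - no network hop
    }
    # Not on this machine: fetch via the tracker instead. (Check your
    # client version for whether this returns the data or a scalar ref.)
    return $mogc->get_file_data($key);
}
```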

More information about the mogilefs mailing list