<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7226.0">
<TITLE>RE: Newbie - hardware recommendations</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2>On Wed, Jan 04, 2006 at 01:47:46AM +0100, jb wrote:<BR>
<BR>
> Hello and happy new year;<BR>
><BR>
&gt; We plan to use MogileFS for an online storage service. In fact, we have been<BR>
&gt; looking for several months for a solution to scale our storage server, and I think it's<BR>
&gt; time to make the right decision ;-)<BR>
><BR>
&gt; Today:<BR>
&gt; Docs are stored on a 1U server with RAID 5 and 600 GB.<BR>
&gt; 300 GB used - 4.5 million files. (99% are pictures, 1% are doc, pdf, or small<BR>
&gt; audio files.)<BR>
&gt; During peak hours, we get 280 requests/sec on this server.<BR>
&gt; On a typical day we serve 8.5 million requests.<BR>
><BR>
&gt; -&gt; This server is really a bottleneck, and we can see it is very near its limits.<BR>
<BR>
What server software are you using? Apache? lighttpd?<BR>
<BR>
> Ideally, we'd like to perform at least 600 requests / sec with our future<BR>
> MogileFS installation.<BR>
><BR>
> We plan to use this configuration :<BR>
><BR>
&gt; - 2 storage nodes: 2 x 2U, each with 6 HDs (2x 80 GB for Debian in RAID<BR>
&gt; 1, and 4x 400 GB for storage with no RAID)<BR>
><BR>
&gt; - 2 trackers: 2 x 1U (with the Perl client on them too)<BR>
><BR>
&gt; - 1 MySQL server: 1 x 1U with 3 GB of RAM<BR>
><BR>
> So my questions :<BR>
<BR>
&gt; 1. Do you think this configuration is sufficient to handle this traffic? I'd<BR>
&gt; like to hear about your experience and maybe get some recommendations.<BR>
<BR>
Got me; I'm sure Brad would know, though :-)<BR>
<BR>
&gt; 2. I think MySQL could become the new bottleneck; do you have any tips, based<BR>
&gt; on your experience with MogileFS, for the hardware configuration of this<BR>
&gt; server?<BR>
<BR>
Maybe not the bottleneck, but it becomes a single point of failure (SPOF) unless you do regular backups/replication.<BR>
<BR>
&gt; 3. Can Perlbal cache documents (like Squid, for example), or is it<BR>
&gt; good practice to cache documents on the tracker?<BR>
<BR>
Perlbal doesn't need to cache documents; it runs on the same machine that holds the files.<BR>
MogileFS works like this:<BR>
<BR>
1) client -> tracker : I want to read foo.jpg<BR>
2) tracker -> mysql : give me the record for foo.jpg<BR>
3) mysql -> tracker -> client: that is file 00001, on x.x.x.2 or x.x.x.3<BR>
4) client -> x.x.x.2(perlbal): hey, give me 00001 ( GET /......./000001 )<BR>
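A rough sketch of the four steps above in Python (names and the in-memory "table" are hypothetical, just for illustration; the real tracker speaks its own wire protocol and reads the metadata from MySQL):<BR>

```python
# Hypothetical sketch of the MogileFS read path described above.
# FILE_TABLE stands in for the MySQL metadata table (steps 2-3).
FILE_TABLE = {
    "foo.jpg": {"fid": "00001", "hosts": ["x.x.x.2", "x.x.x.3"]},
}

def tracker_lookup(key):
    """Steps 1-3: the client asks the tracker, the tracker asks MySQL,
    and the client gets back one URL per replica of the file."""
    row = FILE_TABLE[key]
    return ["http://%s/%s" % (host, row["fid"]) for host in row["hosts"]]

# Step 4: the client does a plain HTTP GET against any of the URLs.
urls = tracker_lookup("foo.jpg")
print(urls[0])  # http://x.x.x.2/00001
```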
<BR>
What I believe LiveJournal does is use memcached to cache steps 2 and 3, so repeated hits for the same file never touch the tracker or the MySQL server.<BR>
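That caching is just the cache-aside pattern: check the cache before paying the tracker/MySQL cost. A hypothetical sketch (a plain dict stands in for memcached; in production you would use a real memcached client):<BR>

```python
# Hypothetical cache-aside sketch of the memcached idea above.
cache = {}   # stands in for memcached
db_hits = 0  # counts how often we actually touch the tracker/MySQL

def lookup_paths_from_db(key):
    global db_hits
    db_hits += 1
    return ["http://x.x.x.2/00001"]  # pretend MySQL lookup (steps 2-3)

def cached_lookup(key):
    if key not in cache:               # miss: pay the lookup cost once
        cache[key] = lookup_paths_from_db(key)
    return cache[key]                  # hit: served straight from memory

cached_lookup("foo.jpg")
cached_lookup("foo.jpg")
print(db_hits)  # 1 -- the repeated hit never touched the database
```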
<BR>
Once you are at step 4, you simply have one or more URLs to your file, and it is just plain old HTTP from that point on.<BR>
<BR>
If you read some of the PDFs on <A HREF="http://danga.com/words/">http://danga.com/words/</A>, they might give you some ideas.<BR>
<BR>
> Thanks for your answers.<BR>
<BR>
> JB<BR>
<BR>
--<BR>
- Justin</FONT>
</P>
</BODY>
</HTML>