Data splitting

dormando dormando at rydia.net
Thu Aug 2 08:26:24 UTC 2007


Well. MogileFS is a bit more complicated to set up than this, but it'd 
be an improvement.

I'd recommend setting it up on a box or two and seeing for yourself. 
With the built-in IO monitoring and the 'drain' feature, shuffling files 
around is a snap. I could go on about what specifically is better, but 
you should really dig in and see for yourself.

Everything overloaded? Just add more disks. New files will end up in the 
right place.
Stuff dying? Just add more. It'll deal.
Want to add features? Easy, it's just Perl.
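
A toy sketch of the idea, in Python for brevity. This is not MogileFS code 
or its API; every class and method name below is made up for illustration. 
The point is only that once the app asks an index where a file lives, 
instead of hardcoding img1..img5, adding a box or draining one stops 
being a manual copy-and-update-the-database job.

```python
# Toy model only: NOT MogileFS code or its API. All names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    capacity: int                      # max number of files, to keep it simple
    draining: bool = False             # True: take no new files, move old ones off
    files: set = field(default_factory=set)

    def free(self) -> int:
        return self.capacity - len(self.files)


class Index:
    """Maps a file key to the device holding it, so the app never hardcodes a server."""

    def __init__(self):
        self.devices = {}              # device name -> Device
        self.location = {}             # file key -> device name

    def add_device(self, name, capacity):
        self.devices[name] = Device(name, capacity)

    def _pick(self):
        # Choose the writable (non-draining) device with the most free space.
        candidates = [d for d in self.devices.values()
                      if not d.draining and d.free() > 0]
        if not candidates:
            raise RuntimeError("no writable device with free space")
        return max(candidates, key=lambda d: d.free())

    def store(self, key):
        dev = self._pick()
        dev.files.add(key)
        self.location[key] = dev.name
        return dev.name

    def lookup(self, key):
        return self.location[key]

    def drain(self, name):
        # Stop writing to the device, then move its files elsewhere,
        # updating the index as each file moves.
        src = self.devices[name]
        src.draining = True
        for key in list(src.files):
            dst = self._pick()
            dst.files.add(key)
            src.files.remove(key)
            self.location[key] = dst.name


if __name__ == "__main__":
    idx = Index()
    idx.add_device("img1", capacity=2)
    idx.store("a.jpg")
    idx.store("b.jpg")                 # img1 is now full
    idx.add_device("img6", capacity=10)
    print(idx.store("c.jpg"))          # new file lands on img6
    idx.drain("img1")
    print(idx.lookup("a.jpg"))         # now on img6
```

The app only ever calls store() and lookup(), so nothing in it changes 
when a device is added or emptied.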

Blah blah. Easy.

-Dormando

drpr0ctologist at gmail.com wrote:
> I need help in clarifying exactly what MogileFS can do for me.  
> 
> Basically I have an image hosting site.  Currently I perform file 
> storage through NFS.  If someone can review what I have already below, 
> and let me know if my site is an ideal candidate for MogileFS, that 
> would be greatly appreciated!
> 
> Here's how it's done in detail:
> 
> Server WS = web server
> Server IMG1 = image server 1
> Server IMG2 = image server 2
> Server IMG3 = image server 3
> Server IMG4 = image server 4
> Server IMG5 = image server 5
> 
> Uploaded images are stored using NFS.  IMG1's image folder is mapped to 
> WS's /www/site/images/img1/.  Here's the chart:
> 
> Images for IMG1 are stored in /www/images/img1/ on WS.
> Images for IMG2 are stored in /www/images/img2/ on WS.
> Images for IMG3 are stored in /www/images/img3/ on WS.
> Images for IMG4 are stored in /www/images/img4/ on WS.
> Images for IMG5 are stored in /www/images/img5/ on WS.
> 
> Using the above strategy, my PHP application doesn't need to know that 
> the storage is off-server.  
> 
> To serve the images back to the user, instead of having it route from 
> Client > WS > IMG1 > WS > Client, I'm letting clients access these IMG 
> servers directly, so that it becomes Client > IMG1 > Client.
> 
> This is done by giving each IMG server its own Apache and URL. Here's 
> the chart:
> 
> Server IMG1 = img1.mysite.com <http://img1.mysite.com/>
> Server IMG2 = img2.mysite.com <http://img2.mysite.com/>
> Server IMG3 = img3.mysite.com <http://img3.mysite.com/>
> Server IMG4 = img4.mysite.com <http://img4.mysite.com/>
> Server IMG5 = img5.mysite.com <http://img5.mysite.com/>
> 
> The problem is solved.  However there are some drawbacks:
> 
>     * If an IMG server becomes full and I need to split the data, it
>       becomes painful. Imagine having to copy arbitrary amounts of data
>       from IMG1, then splitting that to IMG1 and IMG6.  Then, you have
>       to update the database so it knows of the split.
>     * Mounting NFS volumes for every single web server is a hassle.
> 
> How can MogileFS help here?
> 


