Distributing MogileFS' Database

Sat Sep 23 07:19:00 UTC 2006

Hey all,

My girlfriend ditched me on a Friday night, so naturally I'm bored and 
thinking about MogileFS. (It's more interesting than fixing my 
six-year-old Perl scripts.)

There was a very brief talk at the summit about how to make MogileFS's 
database store more distributed. Could the ML flesh out some ideas a bit 
more? The reason I bring it up now is that during all the minor 
hacking/documentation, we can spot areas that need adjustment and start 
working towards the eventual goal.

I know Brad has/had some ideas of how it should work ("As easy as 
possible" for starters), plus there are issues of redundancy in the dataset.

Obviously there are some easy parts and some hard parts, but I'm an 
idiot so it's all probably easy. Distributing the device, class, and 
domain tables is probably hard. Distributing anything related to files 
within a domain is easy. A domain could automatically be distributed 
among more databases by adding "subdomains" that point to different 
databases, plus a set of rules to resolve which subdomain a file is in. 
The most basic option possible: put whole domains on different 
databases.
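
To make the "subdomain" idea a bit more concrete, here is a rough sketch 
of one possible resolution rule: hash the file key and pick a database 
from the domain's list. Everything in it is an assumption for 
illustration only. The SUBDOMAIN_DBS table, the DSN strings, and the 
resolve_subdomain() function are all made up and are not part of 
MogileFS. It is in Python just to keep it short.

```python
import hashlib

# Hypothetical mapping: each domain lists one database per "subdomain".
# The domain names and DSN strings are invented for this example.
SUBDOMAIN_DBS = {
    "images": ["dbi:mysql:mogile_images_0", "dbi:mysql:mogile_images_1"],
    "logs": ["dbi:mysql:mogile_logs_0"],  # a whole domain on one database
}


def resolve_subdomain(domain, key):
    """Return (subdomain index, database DSN) for a file key in a domain.

    The rule is a plain hash-modulo: the same key always resolves to the
    same subdomain as long as the domain's database list does not change.
    """
    dbs = SUBDOMAIN_DBS[domain]
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(dbs)
    return index, dbs[index]


if __name__ == "__main__":
    for key in ("cat.jpg", "dog.jpg", "2006-09-23.log"):
        domain = "logs" if key.endswith(".log") else "images"
        print(domain, key, resolve_subdomain(domain, key))
```

The catch with a plain modulo rule is that adding a subdomain changes 
where most existing keys resolve, so rows would have to be moved. A 
lookup table or some consistent hashing scheme would avoid most of 
that, at the cost of more moving parts.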

Data redundancy? I would vote for the database admin to figure out their 
own redundancy. All of the Oracle shops I've dealt with have 
"redundancy" through massive fiber SANs, backups through tapes and SAN 
snapshots, and maybe a hot-spare database server somewhere. Having data 
duplicated in multiple databases becomes very expensive for places with 
non-Free DB software, and is often not done at all.

Folks who run MySQL or Postgres and worry about redundancy just buy a 
second server and set up DRBD or a replication slave.

Of course there're also wacky ideas... SQLite DBs distributed among all 
of the mogstored nodes with one per so many files? How does key 
discovery work if you want your database to be completely distributed? 
Anything else?

-Dormando