db-independence update; call for hackers

Brad Fitzpatrick brad at danga.com
Fri Jan 5 07:09:32 UTC 2007


Earl,

On Thu, 4 Jan 2007, Cahill, Earl wrote:

> > But am I doing this for fun?
>
> I would say +1 even if all we got out of it was ability to use memcache
> easier.

Memcache's going in independent of all this anyway.  It'll be done only
for get_paths (the only part that matters for perf), and done in front of
the Store interface.  See, the problem with doing it in the Store
interface is that different callers want different data-correctness
guarantees.  We probably also want per-process (parent process) caching,
which means we can't use the Store, because that happens in the child
processes.  So get_paths will be a little bit special.
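
A minimal sketch of what caching "in front of the Store interface" could
look like, in Python rather than MogileFS's Perl.  The names PathCache,
max_age and lookup are hypothetical, and a plain dict stands in for
memcached.  The point is only that each caller picks its own staleness
tolerance, which is hard to express inside a single Store interface.

```python
import time


class PathCache:
    """Cache placed in front of a store lookup (all names are hypothetical)."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup    # stand-in for the real Store call
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (fetched_at, paths); stand-in for memcached

    def get_paths(self, key, max_age):
        """Return cached paths if younger than max_age seconds, else refetch.

        max_age=0 always goes to the store, for callers that need
        fully correct data.
        """
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and max_age > 0 and now - hit[0] < max_age:
            return hit[1]
        paths = self._lookup(key)
        self._entries[key] = (now, paths)
        return paths
```

A caller that tolerates stale data would call get_paths(key, 60); one that
does not would call get_paths(key, 0) and always hit the store.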

> That said, let's say you want to store 100 billion photos.  I
> am pretty convinced that mysql will likely have a hard time keeping up,
> but I would have more confidence that an oracle solution might exist.

Well, yes, but I want to add generic partitioning in Mogile that works on
top of any other store, so you can do partitioned MySQL, partitioned
Postgres, etc. (and be able to add/remove databases at runtime).
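
One way partitioning over arbitrary stores could be sketched is shown
below.  It uses rendezvous (highest-random-weight) hashing, which is an
assumption chosen for illustration and not a design stated in this thread.
PartitionedStore and its methods are hypothetical names.

```python
import hashlib


class PartitionedStore:
    """Routes keys to named backends using rendezvous hashing (a sketch)."""

    def __init__(self):
        self._backends = {}  # name -> backend object (MySQL, Postgres, ...)

    def add_backend(self, name, backend):
        self._backends[name] = backend

    def remove_backend(self, name):
        del self._backends[name]

    def backend_for(self, key):
        """Return the name of the backend responsible for key."""
        if not self._backends:
            raise LookupError("no backends configured")

        def score(name):
            digest = hashlib.sha1(f"{name}:{key}".encode()).digest()
            return int.from_bytes(digest[:8], "big")

        # The backend with the highest score for this key wins.
        return max(self._backends, key=score)
```

With this scheme, removing a backend only remaps the keys that lived on
it, and adding one only pulls keys onto the new backend.  Moving the
existing rows when membership changes is not shown here.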

Then I want to start thinking about letting users configure different
Stores for small metadata (host/device/domain/class list) vs huge stuff
(file/file_on).  My eventual goal is people will run MySQL Cluster over
3 nodes for the small stuff, then partition (with redundancy) the big
stuff over a lot of disk-based databases (MySQL, Pg, ...).
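
A rough sketch of that split, assuming a hypothetical StoreRouter that
picks a store by table name.  The table names come from the lists above;
everything else is made up for illustration.

```python
SMALL_TABLES = {"host", "device", "domain", "class"}
LARGE_TABLES = {"file", "file_on"}


class StoreRouter:
    """Sends small metadata tables to one store and huge tables to another."""

    def __init__(self, metadata_store, bulk_store):
        self.metadata_store = metadata_store  # e.g. a small clustered database
        self.bulk_store = bulk_store          # e.g. a partitioned set of databases

    def store_for(self, table):
        if table in SMALL_TABLES:
            return self.metadata_store
        if table in LARGE_TABLES:
            return self.bulk_store
        raise KeyError(f"no store configured for table {table!r}")
```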

> I also like the idea of queries being centralized, so as to more easily
> add logging.  Did it on a recent project and went from I think hundreds
> of queries to generate a page (even on the second hit), to a few on the
> first hit and zero on the second hit.  Pretty hard to do that analysis
> when the queries are spread out everywhere.

DBI has client-side hooks for this, and so should your database (MySQL
does).  Less of a concern for me.
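
The idea behind such hooks is that the driver reports every statement it
runs, so logging does not require centralizing the queries in your own
code.  In Perl's DBI the built-in tracing (DBI->trace, or the DBI_TRACE
environment variable) serves this purpose.  The sketch below shows the
same idea with Python's standard sqlite3 module and its
set_trace_callback method, purely as a stand-in; the table and queries
are made up for illustration.

```python
import sqlite3


def connect_with_query_log(path=":memory:"):
    """Open a connection whose driver reports every statement it executes."""
    executed = []
    # isolation_level=None (autocommit) keeps implicit BEGINs out of the log.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.set_trace_callback(executed.append)
    return conn, executed


conn, executed = connect_with_query_log()
conn.execute("CREATE TABLE file (fid INTEGER PRIMARY KEY, dkey TEXT)")
conn.execute("INSERT INTO file (fid, dkey) VALUES (1, 'photo1')")
rows = conn.execute("SELECT dkey FROM file WHERE fid = 1").fetchall()
print(len(executed), "statements logged")
```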

> > So ... any Postgres/Oracle-using SQL/Perl hackers out there?
>
> Done plenty of both, how can I help?

If I haven't dissuaded you, try to get the test suite running (and
passing) with Postgres and/or Oracle.

$ cd $HACKBASE
$ svn co http://code.sixapart.com/svn/mogilefs/trunk mogilefs
$ cd mogilefs/server
$ cp lib/MogileFS/Store/MySQL.pm lib/MogileFS/Store/Oracle.pm
$ $EDITOR lib/MogileFS/Store/Oracle.pm
$ $EDITOR lib/MogileFS/Store.pm
$ $EDITOR mogdbsetup

And once you think it's ready, hack up the test suite to make temporary
Pg/Oracle databases instead of temporary MySQL databases:

$ $EDITOR t/lib/mogtestlib.pl

And run the tests with:

$ prove -v t

(make sure the tests all pass with MySQL before you start... :))

- Brad


