database-independence: it has begun.

Tue Jan 2 18:29:51 UTC 2007

On Tue, 2 Jan 2007, Hill, Greg wrote:

> > I don't know if you really want to add memcached at the tracker level.
> > By caching on the client side, not only do you take load off the
> > databases, you take load off of the trackers.  You'd be better off
> > writing a layer on top of the client side API to do the caching.  It's
> > a lot easier to scale the client side of mogilefs than it is to scale
> > the server (tracker + database) side of things.
>
> Not sure I follow, are you saying to cache the actual files in memcache?
> AFAIK, we are going to use a file cache on the client side, but we
> wanted to limit load on the mysql servers by putting memcache between
> mogile and mysql.  Or, at the minimum, to have that option should load
> get too high on mysql (and with the current code, that would be
> non-trivial, but having a centralized querying class makes it simple).
> Mysql seems to be the one that's least likely to scale, since you can't
> distribute writes very easily with it.  We can add as many tracker boxes
> as we want.  Or maybe I'm just misunderstanding something.

He's saying you should have your app query memcached for "get_paths"
requests, instead of sending get_paths requests to your trackers.

Yes, it's easy to scale out MogileFS trackers by just adding more, but
it's quicker to have one roundtrip (app <-> memcached) instead of two
(app <-> tracker <-> memcached).
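
A rough sketch of that app-side approach, in Python rather than Perl, just
to show the shape of it.  Everything named here is made up for
illustration: DictCache is an in-memory stand-in for a memcached client,
FakeTracker stands in for a MogileFS client object with a get_paths
method, and the "mogpaths:" key prefix and 60 second TTL are arbitrary.

```python
import time


class DictCache:
    """In-memory stand-in for a memcached client (hypothetical, get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)


class FakeTracker:
    """Stand-in for a tracker client; counts how often it is actually asked."""

    def __init__(self, paths_by_key):
        self.paths_by_key = paths_by_key
        self.calls = 0

    def get_paths(self, key):
        self.calls += 1
        return list(self.paths_by_key.get(key, []))


def cached_get_paths(cache, tracker, key, ttl=60):
    """Ask the cache first; only go to the tracker on a miss."""
    cache_key = "mogpaths:" + key  # prefix is an assumption, not a MogileFS convention
    paths = cache.get(cache_key)
    if paths is None:
        paths = tracker.get_paths(key)
        if paths:
            # Only cache non-empty results so a missing file is re-checked next time.
            cache.set(cache_key, paths, ttl)
    return paths
```

On a hit the app never talks to a tracker at all.  The cost is that cached
paths can go stale if a file is deleted or moved within the TTL, so the app
would need to fall back to the tracker when a cached path fails.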

THAT SAID ... :)  I can add optional memcached caching to get_paths, for
those who want it.

I'm wondering if I should also add local in-core caching.  If I did it
in the parent process (the event loop at top), then I wouldn't even
have to deal with hashing requests onto the right child.
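
For the in-core idea, something like the following is what I have in mind,
sketched in Python only for illustration (the tracker itself is Perl).
LocalPathCache, its 5 second default TTL, and the injectable clock are all
hypothetical; none of this exists in the tracker today.

```python
import time


class LocalPathCache:
    """Tiny in-process cache with a per-entry time-to-live."""

    def __init__(self, ttl=5.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        paths, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            # Too old: forget it so the caller does a real lookup.
            del self._entries[key]
            return None
        return paths

    def put(self, key, paths):
        # Copy the list so later changes by the caller don't alter the cache.
        self._entries[key] = (list(paths), self.clock())
```

Since a single process would own the whole table, any request could be
answered from it without first being routed to a particular child.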

- Brad