Multiple trackers on a MySQL multi-master setup
dormando
dormando at rydia.net
Fri Feb 1 05:00:26 UTC 2008
> The trackers don't need to know about each other, they're happy
> existing in their own little world. The only thing you will want to
> watch is the fsck job, IIRC you should only run one of those across
> the cluster. (I don't think it breaks anything to run it from two
> locations, it's just not efficient/extra load.)
It actually elects which tracker to run the fsck service on. So you'd
set one per tracker and leave it like that.
The rest of what Mark says is right :) They only know of each other
through locks on the master database (which I just said is pretty gosh
darn important). Otherwise they just work in parallel in ways that're
not-quite-perfect but otherwise totally awesome.
>> Also, I want to set this up in a mysql multimaster situation, probably
>> having each tracker talk to the mysql instance on the machine it's on. From
>> what I know about mogile, that shouldn't be a problem. Is there anything I
>> should know about setting mogile up in this way?
>
> Actually, I don't think MogileFS supports multiple masters in that
> way. Not by default anyway, since it uses the MySQL AUTO_INCREMENT
> type, which isn't going to work in a master-master setup if you're
> writing to both. (Key collisions.)
I'd seriously insist that the trackers should use the same DB as their
"master", and if you want to spread out reads use slaves for the
get_paths commands.
However, I've run mogilefs on some _gigantic_ sites, and doing path
caching *outside* of mogilefs (cache the paths in memcached for 15
minutes, don't even talk to the tracker if you have it in cache) will
reduce 90% of the DB load and allow you to go on for quite a while.
Potentially forever :)
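A rough sketch of that outside-of-mogilefs path cache, for illustration only. It uses a plain dict in place of memcached so it runs on its own, and PathCache and fetch_from_tracker are made-up names, not part of any MogileFS client API. The idea is just: look in the cache first, and only ask the tracker on a miss or after the 15 minutes are up.

```python
import time


class PathCache:
    """In-process stand-in for memcached with a per-entry TTL (hypothetical helper)."""

    def __init__(self, ttl_seconds=15 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, paths)

    def get_paths(self, key, fetch_from_tracker):
        """Return cached paths for key, calling fetch_from_tracker only on a miss."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            # Cache hit: the tracker (and its database) is never contacted.
            return hit[1]
        # Miss or expired entry: one round trip to the tracker, then cache it.
        paths = fetch_from_tracker(key)
        self._entries[key] = (now + self.ttl, paths)
        return paths
```

In a real deployment fetch_from_tracker would wrap whatever get_paths call your client library provides, and the dict would be a memcached client. The catch is that a cached path can be stale for up to the TTL, so the application should fall back to the tracker if a cached path fails.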
>
> However, you can get around this by assigning FIDs manually when you
> insert files. You'd need a separate key generation application and
> perhaps use something like 64 bit numbers to reduce the collisions.
> That would allow you to do the kind of setup you're talking about.
I think we should disable this feature. Using auto_increment_offset and
only pointing to one machine at a time appears to work fine (or using
DRBD if all you want is redundancy). Presently the directory hashing
algorithm degrades if the number is "bigger" than ten digits. The other
issue is wanting to keep the digits in line with what InnoDB wants, so
it can do clustered reads more efficiently.
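To see why auto_increment_offset avoids the key collisions, here is a small simulation. It is a simplification that assumes two masters with auto_increment_increment=2 and offsets 1 and 2, and it ignores whatever values are already in the table. auto_increment_ids is a made-up helper name, not anything from MySQL or MogileFS.

```python
from itertools import count, islice


def auto_increment_ids(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    return count(start=offset, step=increment)


# Assumed settings: both masters step by 2, with different starting offsets.
master_a = list(islice(auto_increment_ids(offset=1, increment=2), 5))
master_b = list(islice(auto_increment_ids(offset=2, increment=2), 5))

# The two sequences interleave and never share a value.
overlap = set(master_a) & set(master_b)
```

Here master_a gets the odd IDs and master_b the even ones, so overlap is empty. The IDs also stay small and roughly sequential, which matters for the directory hashing and InnoDB points above.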
> I've not personally spent any time testing MogileFS in a master-master
> configuration, but I can't see why it shouldn't work. YMMV, let us
> know if you do any testing. Of course the kind of problems you'd
> expect won't really show up in lightly loaded conditions...
I built gaia's like this, and there's one at sixapart now doing it. (By
this, I meant auto_increment_offset multi-master, with _only_ one side
active.) Failover will lose any files in the process of being uploaded,
get_paths in action, etc, but it'll keep on trucking just fine afterwards.
-Dormando