MogileFS limits...

Sat Apr 12 23:48:30 UTC 2008

Dormando,

Thanks for the tips! We've got a project underway already to move to
InnoDB but it's a bit of a hassle (particularly since we really really
don't like going offline for maintenance). And yeah, we got some bad
mental habits from paying for servers out-of-pocket, but we'll
certainly be going a bit upmarket from our current 0.5U pizza boxes
for our tracker dbs. :)

Ask - good data points.

-n

On Sat, Apr 12, 2008 at 4:27 PM, dormando <dormando at rydia.net> wrote:
> Nathan Schmidt wrote:
>
> > Dormando,
> >
> > The most intense performance hits for us have come from losing storage
> > nodes and the resulting rebalance operations putting a lot of
> > read/write load on the tables. Worst case we've seen was almost 20
> > hours of thrashing. Not really user-affecting but it did peg a lot of
> > ops charts.
> >
>
>  We have huge storage hosts (bad idea, I know). So losing an entire host,
> while infrequent, can cause thrashing of up to a week. Lowering the
> replicate workers prevents it from actually hurting anything, but it does
> run full tilt for a while. We have some ideas floating around to make this
> less painful / adjustable. Probably won't do them for a while though :\
>
>
>
> > We're far from optimal - using UTF-8 collations/charsets which seems
> > to unconditionally add a great deal of overhead to index sizes. Our
> > initial bring-up was on a clean debian box so we went with MyISAM,
> > which does have its problems - index corruption, O(n) repairs, ugh.
> > Functional but non-optimal. Good point about the optimize table
> > routine, we probably haven't done that for a while.
> >
>
>  Oh man :)
>
>  Switch to InnoDB as soon as possible! The access patterns were designed for
> InnoDB, and you're very likely to get lock thrashing under MyISAM. MogileFS
> also counts on the database being fairly reliable.
>
>  Get a box with a decent RAID housing battery backed write cache, get more
> RAM, and switch to InnoDB :) Should be way faster even if the data size
> jumps a bit during the switch. OPTIMIZE TABLE still applies.
>
>
>
> > We do a lot of list_keys operations when our caches are cold because
> > we're using mogile as our primary datastore for wiki page revisions -
> > we didn't really anticipate the rest of our infrastructure handling
> > this kind of data size so well so mogile's been left alone while the
> > rest of our system has become much more efficient and
> > well-distributed. We're on all-commodity hardware, and those boxes max
> > out at 8Gb each, which puts a ceiling on where we can go without
> > moving to some substantially more expensive systems. Our file.MYI and
> > file_on.MYI are 8.5Gb together, which means MySQL is doing some
> > interesting tricks to keep things running smoothly. That said, we're
> > getting ready to bite the bullet and bring in somewhat more
> > substantial boxes for our tracker dbs, oh well.
> >
>
>  I think 16-32G are still commodity (see Ask's response)...
>
>  I don't really believe you lose out once the index is large than RAM.
> Especially with MyISAM files, you lose out when your key_reads over
> key_read_requests get above nothing. A bulk of your data is unlikely to be
> active, so the entire index is never cached in RAM. Under innodb you give
> both the index _and_ data caches a large sum of memory, which are also
> LRU's.
>
>  So while my mogilefs DB's have 50G+ datafiles sometimes, 8-16G of RAM is
> plenty. The writeback cache is a gigantic win due to all of the scratch
> writes mogilefs does. Also switching to deadline elevator for the io
> scheduler... The read is read cache, and you're caching hot paths :)
>
>  -Dormando
>

-- 
New office! 1825 South Grant Street, Suite 850 San Mateo 94402
New home! 21677 Rainbow Drive Cupertino 95014
New phone! 415.420.1647