memory layout idea: the bulldozer
Joachim Bauernberger
joachim.bauernberger@friendscout24.de
Fri, 9 Apr 2004 10:33:02 +0200
Hi,
the following might be completely not interesting or whack :-)
I have been playing around with linux kernel tmpfs option once, which allow=
ed=20
me to use write/read system calls to write stuff to memory.=20
That way there is neither malloc nor disk access, etc ... to worry about.
I am no wizard when it comes to understanding the difference in Memory=20
Management on the OS level, so please be gentle in case this approach would=
=20
suck :-)
http://www-106.ibm.com/developerworks/library/l-fs3.html
Regards,
~/joachim
On Friday 09 April 2004 00:29, Brad Fitzpatrick wrote:
> Avva and I have been busy talking about new memory layout strategies that
> would be better inside memcached. The current approach is super paranoid,
> is easy, and uses very minimal CPU, but has two annoying properties:
>
> -- ~35% memory is wasted
>
> -- it requires slab page reassignment as usage patterns change,
> especially for smaller memcached instances
>
> So we've been talking about LRU'ing slabs instead of items (see previous
> mail), but has some annoying properties as well. Active items with
> sibling inactive items can hog slabs. So then we're talking about doing
> both item and slab LRUs and always deciding to either free the oldest item
> in the right size class, or free an entire slab (of any class), depending
> on what's older. That helps a little, but not much. There are then
> strategies to move all the items of like-size and like-popularity onto the
> same slabs, so slabs are uniformly popular or boring and the boring ones
> die off quickly. A coworker of mine pointed me at the "Train GC
> algorithm" for more on that, but I haven't read it yet.
>
> In the end, I'm really thinking a slab allocator isn't what we need.
>
> I've been thinking about the following model, which I've been explaining
> to people with a bulldozer analogy:
>
> -- You have a bunch of fairly large (~1MB) contig regions that form
> one huge logical region. (the small 1MB chunks are just an
> implementation detail to make sure the program is able to get
> 2GB or so of memory total, which it might not be able to do in
> one chunk? maybe so.)
>
> -- Starting from the left, you pack items in as tight as possible.
> (pointer aligned, basically. no free space between items)
>
> -- The point at which items are being added to this region is
> the bulldozer.
>
> -- The bulldozer point wraps around. Once it gets to the right,
> it starts over.
>
> -- If any item's in the way, they're freed. (this will require
> an index of the left side of all items...)
>
> -- If any item is within range of the bulldozer's horn when a GET
> is done on it, it's relocated behind the bulldozer and the
> bulldozer advances. (the horn might be 25% or 50% of the total
> memory?)
>
> -- So the only things that are bulldozed are things that were not
> accessed and moved before the bulldozer got to them.
>
> -- item updates can happen in place, if they're the same size or
> smaller. (like increments). or they can just move immediately.
> probably wouldn't matter much.
>
> Key advantages to this scheme:
>
> -- pretty damn simple
>
> -- no size classes. no tuning. no page reassignment.
>
> -- global item LRU, kinda. rough. should work?
>
> -- memory is packed pretty tight, and constantly being
> repacked by the bulldozer.
>
> Optional additions:
>
> -- keep index of free regions and use those instead of the bulldozer.
> that is, when something is deleted or GET'ed and found too old,
> we keep track of the hole in the compacted memory and reuse it later.
> then do some least-fit algorithm, finding the smallest free region
> new stores can go into. failing that, move the bulldozer and put it
> behind that.
>
> problem with that: the free regions will need to be bulldozed off
> the index too. which means:
>
> index to used items
> index from free size to free region
> index from free region to free size (to the previous index)
>
> we also probably want to bulldoze free regions a little bit further
> than normal items, since it'd be dumb to be putting new items just
> where they'd be soon to be bulldozed.
>
> so the whole free region thing may be more trouble than it's worth.
>
> Thoughts?
=2D- =A0 =A0 =A0 =A0=20
Web: http://www.bauernberger.de