memory fragmentation issues in 1.2?
Steven Grimm
sgrimm at facebook.com
Thu Dec 7 18:47:25 UTC 2006
Paul T wrote:
> Slabs come with a price - they make valgrind useless,
> they were causing wasting up to 50% of RAM (dunno if
> Facebook fixed that, I suppose they did, but there
> was no benchmark to see how much RAM is still getting
> wasted in those 'regions') ... and it turns out that
> memory is still leaking ...
>
First of all, if you don't like the slabs, you can compile with
-DUSE_SYSTEM_MALLOC to bypass them completely and use the system malloc
instead. At that point you should be able to use valgrind as you see
fit. You will have to live with slightly higher CPU consumption (5-7%
higher in my tests, though that's obviously highly dependent on your
system malloc implementation). That said...
It's trivial to see the memory overhead, at least once the cache fills
up. The "stats" command will show it to you. Here's the relevant part of
the output from one of our servers:
STAT bytes 12224855073
STAT limit_maxbytes 13631488000
The second number is the configured memory limit, i.e., the total size
of all the slabs once the cache is full. The first number is the total
size of all the items in the cache, including their per-item headers. So
from that, you can see that this particular cache instance has a memory
efficiency of about 90%. Still plenty of room for improvement, no
argument there (and I happen to know that at least one person is working
on it) but it's in the realm of reasonableness, at least for us,
considering the CPU efficiency gain and considering that we might well
lose a similar amount of memory to fragmentation in a non-slab
environment anyway.
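If you want to check the same figure on your own instances, it's just the ratio of those two stats. Here's a quick sketch in Python (the stat names are real; the hardcoded reply text is simply the excerpt above, and in practice you'd read it off the socket):

```python
# Compute cache memory efficiency from memcached "stats" output.
# "bytes" is the total size of stored items (including per-item
# headers); "limit_maxbytes" is the configured memory limit, i.e.
# the total size of all slabs once the cache is full.
stats_reply = """\
STAT bytes 12224855073
STAT limit_maxbytes 13631488000
"""

stats = {}
for line in stats_reply.splitlines():
    _, name, value = line.split()
    stats[name] = int(value)

efficiency = stats["bytes"] / stats["limit_maxbytes"]
print(f"memory efficiency: {efficiency:.1%}")  # about 90%
```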
As for memory leaks, I'm not denying it's *possible* there's a leak, but
as one of the biggest memcached installations out there, we haven't run
into it. Here are another few excerpts from our stats output:
STAT uptime 3180512
STAT cmd_get 24710319401
STAT cmd_set 876632609
That translates to about 36 days of uptime, and as you can see from the
command counts, our instances aren't exactly sitting around idle. And
their memory footprints are not growing steadily over time. We peak at
over 30,000 connections (this instance doesn't use UDP for various
reasons), so we also see:
STAT connection_structures 36656
which does cause some memory overhead. As you can see, our maximum
configured size is 13000 megabytes; right now, according to "top", the
process size of the instance in question is 14.0GB. So we have just over
1GB of overhead -- but that holds steady once we've hit peak load a
couple times and all the connection structures that need to get
allocated are allocated. It is not a steadily increasing number -- we'd
certainly know if it was, since the machine in question only has 16GB of
RAM and we'd be in a world of hurt if memcached started swapping.
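For anyone following along, the arithmetic behind those figures is straightforward (the numbers come straight from the stats output and the "top" reading above; treating top's 14.0GB as gibibytes is my assumption):

```python
# Uptime is reported in seconds; 3180512 s is about 36.8 days.
uptime_days = 3180512 / 86400

# Overhead: process size (from top) minus the configured cache limit.
process_size = 14.0 * 2**30           # 14.0GB as reported by top
limit_maxbytes = 13631488000          # 13000 megabytes
overhead_gb = (process_size - limit_maxbytes) / 2**30

print(f"uptime: {uptime_days:.1f} days")     # about 36.8 days
print(f"overhead: {overhead_gb:.2f} GB")     # just over 1GB
```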
> Once upon a time instead of arguing for years about
> (many) moments like slabs I just bit the bullet and
> rewrote the whole thing without the slabs, without
> timers, without proprietory semi-binary protocol,
> without fancy (but logically questionable) 'automata'
> protocol implementation, without 'custom hash' e t.c.
> e t.c.
>
How is a state machine in any way logically questionable? More to the
point, how *else* would one implement any protocol at all in a
nonblocking, async-I/O-based environment where only part of a request
might have arrived at any given point? How do you handle getting a
partial HTTP request in Univca, if not with a state machine of some kind?
That's assuming by "automata" you mean finite-state automata, i.e.,
state machines. If that's not what you're referring to then I'm not sure
what you mean.
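To make the point concrete, here is a minimal sketch (not memcached's actual code) of the kind of state machine you need when a request can arrive in arbitrary-sized pieces: the "state" is simply the buffered partial line carried between reads, and a command is only emitted once its terminating CRLF shows up.

```python
class LineProtocolParser:
    """Tiny incremental parser for a CRLF-terminated text protocol.

    feed() may be called with any fragment of the input stream and
    returns whatever complete commands have arrived so far; the rest
    stays buffered until the next read completes it.
    """

    def __init__(self):
        self.buffer = b""

    def feed(self, data):
        self.buffer += data
        commands = []
        while b"\r\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\r\n", 1)
            commands.append(line.decode("ascii"))
        return commands


parser = LineProtocolParser()
parser.feed(b"get f")              # partial request: nothing emitted yet
parser.feed(b"oo\r\nget b")        # first command completes: ["get foo"]
parser.feed(b"ar\r\n")             # second completes: ["get bar"]
```

A nonblocking server does exactly this on every readable-socket event; without some such state machine it has no way to resume a half-received request.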
> Univca is *more* portable already - univca is using
> HTTP for a protocol hence it works with all the tools
> out there, that support HTTP protocol - memcached is
> using proprietory protocol that suffers from
> big/little endian problems.
>
Okay, you've thrown me for a real loop here. Memcached's protocol is
*text-based*. Human-readable, as in not binary. One can (and I
frequently do) telnet to its TCP port and type commands into it. I'm not
aware of anywhere in the memcached protocol where you could even
*detect* what byte order the server is using, let alone where there's a
dependency on it or a problem resulting from it. If you know
differently, please tell me where it is, specifically!
I can do this:
pinklady% telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
set foo 0 0 5
hello
STORED
set bar 0 0 6
pounce
STORED
get foo bar
VALUE foo 0 5
hello
VALUE bar 0 6
pounce
END
All perfectly human-readable (not binary) and no byte order
dependencies. Which parts of the protocol are you referring to? The only
thing I can think of that you might be referring to is the UDP header,
but (a) the UDP protocol is totally optional, and (b) all its header
fields are explicitly defined in the protocol spec to be in network byte
order, so I'm not sure what little/big-endian problems there are with it.
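For reference, that UDP frame header is four 16-bit fields (request ID, sequence number, total datagrams, reserved), all in network byte order per the protocol spec. Building one portably is trivial in any language; a sketch using Python's struct module with its "!" network-order prefix:

```python
import struct

def udp_header(request_id, seq, total, reserved=0):
    """Pack a memcached UDP frame header: four 16-bit fields,
    network byte order, exactly as the protocol spec defines them."""
    return struct.pack("!HHHH", request_id, seq, total, reserved)

# The same bytes come out regardless of the host's native endianness.
header = udp_header(request_id=0x1234, seq=0, total=1)
assert header == b"\x12\x34\x00\x00\x00\x01\x00\x00"
```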
-Steve