Linux & memcached

Anatoly Vorobey mellon@pobox.com
Fri, 17 Oct 2003 23:31:13 +0200


Lately there have been many reports about bad performance of memcached
from people using it on Linux with inappropriate kernel/libevent 
versions. We thought it might be a good idea to summarise again the 
current state of affairs in this area:

1. memcached *needs* a fast and reliable event notification mechanism to
operate. It uses libevent; however, libevent itself is merely a (good) 
set of wrappers around various OS-provided event notification mechanisms. 
When libevent is initialized on program startup, it finds the best
event notification mechanism it can use (the best among those that are
both supported by this version/build of libevent, and supported by the
actual runtime environment - OS and kernel version), and uses it. Since
there are always two lowest-common-denominator mechanisms available
everywhere - select, based on select(), and poll, based on poll() -
libevent *will* always initialize successfully and memcached will work.
But its performance will be abysmal if the mechanism selected by 
libevent is not good enough, either due to its inherent limitations, or 
due to bugs/shortcomings in libevent's wrapping of this mechanism.
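
To make the selection rule above concrete, here is a minimal sketch in
Bourne shell. It is NOT libevent's actual code; the function name
pick_method and the preference order passed to it are made up for
illustration. It just walks a preference list and returns the first
method that both the build and the runtime support:

```shell
# Hypothetical helper, not part of libevent or memcached.
# Usage: pick_method "PREFERENCE ORDER" "BUILD SUPPORTS" "RUNTIME SUPPORTS"
# Each argument is a space-separated list of method names.
pick_method() {
    for m in $1; do
        # Skip methods this libevent build was not compiled with.
        case " $2 " in *" $m "*) ;; *) continue ;; esac
        # Skip methods the running OS/kernel does not provide.
        case " $3 " in *" $m "*) ;; *) continue ;; esac
        echo "$m"
        return 0
    done
    echo "none"
    return 1
}

# A build with epoll support running on an unpatched 2.4 kernel
# falls through to poll:
pick_method "kqueue epoll poll select" "epoll poll select" "poll select"
```

Since select and poll are in both lists practically everywhere, the
loop always finds something, which is exactly why memcached starts up
fine even when the chosen method is a slow one.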

2. Versions of libevent < 0.7 are bad. They're too old, support too
few event notification mechanisms, and contain bugs in their support of
the better ones. Use at least 0.7a, and on Linux, use at least 0.7b
(the current version), since it fixes bugs in the most important 
mechanism on Linux, epoll. Don't use 0.6.x even if it's the stable 
version for Debian or whatever else - they're wrong in this case. 
libevent has been steadily improving, and recent versions have been
more stable and bug-free than their predecessors.

3. On FreeBSD and other BSDs, all recent versions of libevent support 
the kernel queues mechanism (kqueue) which is very fast and nice. It 
should always be the one selected by libevent when it initializes. If 
you build your memcached against a sufficiently recent libevent (see 2. 
above), you should be all set on BSDs.

4. On Linux, the best notification mechanism, the one typically used in 
production when memcached is used on Linux servers, and the one we use
here on LiveJournal.com, is called 'epoll'. However, it became part of
the stock kernel only in 2.6 (in 2.5 really, but we're talking stable
here). In 2.4 kernels, it can be used and is used,
but the kernel must be patched and rebuilt; the current version of the
epoll patch for 2.4.21 is at
http://www.xmailserver.org/linux-patches/epoll-lt-2.4.21-0.18.diff
You'll also need (on 2.4) userspace epoll libraries installed prior to
building libevent, otherwise libevent won't compile its epoll support 
in. Those libraries are available at
http://www.xmailserver.org/linux-patches/epoll-lib-0.10.tar.gz

There are more detailed instructions about all this in the BUILD file
of the memcached distribution.
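
As a quick sanity check before building, something like the following
sketch can tell you whether a kernel release string is new enough to
have epoll without patching. The function name kernel_has_stock_epoll
is made up for illustration, and it only looks at the version number:
a 2.4 kernel with the epoll patch applied will still be reported as
"no", so treat the answer as a hint, not a verdict.

```shell
# Hypothetical helper: prints "yes" if the given kernel release
# (as printed by `uname -r`) is 2.6 or later, "no" otherwise.
# It does not detect a patched 2.4 kernel.
kernel_has_stock_epoll() {
    rel=$1
    major=${rel%%.*}            # text before the first dot
    rest=${rel#*.}              # text after the first dot
    minor=${rest%%.*}           # text before the next dot
    major=${major%%[!0-9]*}     # keep leading digits only
    minor=${minor%%[!0-9]*}
    [ -n "$major" ] || { echo "no"; return 0; }
    [ -n "$minor" ] || minor=0
    if [ "$major" -gt 2 ]; then
        echo "yes"
    elif [ "$major" -eq 2 ] && [ "$minor" -ge 6 ]; then
        echo "yes"
    else
        echo "no"
    fi
}

kernel_has_stock_epoll "$(uname -r)"
```

If this prints "no" on your box, you need the kernel patch and the
userspace epoll libraries mentioned above before building libevent.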

5. If you're using memcached on Linux without epoll, currently your
performance *will* suck. libevent will select the 'poll' method, which
turns out to be horribly slow. We're not sure why it turns out to be
*that* slow - much slower than it should be even considering that poll
is inferior to epoll. It may be due to bugs in libevent's wrapping of
poll() support. We're investigating this issue. 

6. libevent 0.7b introduced a new 'rtsig' notification method for Linux;
rtsig stands for real-time signals. However, it's buggy (not the OS
support for real-time signals, but libevent's wrappers); it
doesn't really work yet, and for that reason it's not even compiled in
by default when libevent 0.7b builds; you have to specifically request
it if you want it. My current project is getting rid of bugs in rtsig
support and making it stable. When this is done (hopefully very soon),
there will probably be a new version of libevent with those fixes, and
*then* everyone who wishes will be able to use memcached on Linux 2.4
without kernel patches for epoll and have decent performance. rtsig 
should be somewhat worse than epoll, but still very fast.

7. (important!) You can check which event notification method your 
memcached uses, at runtime. Set the environment variable 
EVENT_SHOW_METHOD and libevent will print the method it selected on
initialization. For example, in a Bourne shell environment:

$ EVENT_SHOW_METHOD=1 ./memcached
libevent using: rtsig

(that was my working copy of memcached which I'm using to debug rtsig
support).

IMPORTANT: if you do that, and you get "libevent using: X" where X is
NOT 'kqueue' or 'epoll', then, _currently_, you're not using memcached
correctly. Your performance will suck. Hopefully after the current round
of libevent development is done, 'rtsig' and 'poll' will begin to work 
nicely as well. But right now, 'kqueue' on BSDs and 'epoll' on Linux
are the only methods which you should use with memcached.
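
If you want to automate that check, here is a minimal sketch. The
function name check_event_method is made up for illustration; it takes
one line of text in the "libevent using: X" format shown above and
says whether X is one of the two methods that currently work well.
How you capture that line from your own memcached startup is up to
you and is not shown here.

```shell
# Hypothetical helper: classify a "libevent using: X" line.
# Prints "ok: X" for kqueue/epoll, "slow: X" for anything else,
# and "unknown" if the line is not in the expected format.
check_event_method() {
    line=$1
    case "$line" in
        "libevent using: "*) method=${line#"libevent using: "} ;;
        *) echo "unknown"; return 2 ;;
    esac
    case "$method" in
        kqueue|epoll) echo "ok: $method"; return 0 ;;
        *)            echo "slow: $method"; return 1 ;;
    esac
}

check_event_method "libevent using: rtsig"
```

The non-zero return status for anything but kqueue and epoll makes it
easy to fail a startup script early instead of finding out about the
slow method under load.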


--
avva