PATCH: Enable use of large memory pages

Fri Feb 22 05:38:03 UTC 2008

Hey,

Some comments:

Minor bracket style... Most of the code is:

function (blah blah) {

vs some of yours is:

function (blah blah)
{

Also:

+    if (prealloc) {
+        /* Allocate everything in a big chunk with malloc */
+        mem_base = malloc(mem_limit);
+        if (mem_base != NULL) {
+            mem_current = mem_base;
+            mem_avail = mem_limit;
+        } else {
+            fprintf(stderr, "ABORT: Failed to allocate requested memory\n");
+            exit(EXIT_FAILURE);
+        }
+    }

... this probably shouldn't call exit() internally on failure.

Hrm. I think the flow might go over better if 'prealloc' is only set to
true when 'enable_large_pages' returns true (it's void now, but you get
the idea). If you're not trying to set up a large-page environment,
you're more likely to fail a massive malloc() anyway.
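
To make that concrete, here is a rough sketch of the flow I mean. It is
not the patch itself: enable_large_pages() is stubbed to always report
failure, and slabs_init_sketch() and startup_sketch() are made-up names
for illustration. The point is only that 'prealloc' follows the return
value, and that a failed malloc() is reported to the caller instead of
calling exit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static void *mem_base = NULL;
static void *mem_current = NULL;
static size_t mem_avail = 0;

/* Stub for illustration: always reports failure. A real version would
 * do the platform-specific setup and return true only on success. */
static bool enable_large_pages(void) {
    return false;
}

/* Hypothetical init routine. Returns 0 on success, -1 if preallocation
 * was requested but the allocation failed. It never calls exit(). */
static int slabs_init_sketch(size_t mem_limit, bool prealloc) {
    assert(mem_limit > 0);
    if (!prealloc) {
        return 0; /* slab pages get allocated lazily, as before */
    }
    mem_base = malloc(mem_limit);
    if (mem_base == NULL) {
        return -1; /* let the caller decide how to report this */
    }
    mem_current = mem_base;
    mem_avail = mem_limit;
    return 0;
}

/* Hypothetical startup path: only preallocate when large pages were
 * requested (-L) AND actually enabled. */
static int startup_sketch(size_t mem_limit, bool want_large_pages) {
    bool prealloc = want_large_pages && enable_large_pages();
    return slabs_init_sketch(mem_limit, prealloc);
}
```

With the stub above, asking for large pages quietly falls back to the
normal lazy allocation, and main() gets to decide what to print if the
big allocation ever fails.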

Most of the text needs cleaning up, but I can do that.

However, the biggest issue here is that the whole thing is
Solaris-specific :)

So unless you're running Solaris, this essentially gives you an option
which will most likely just cause memcached to fail on large systems.

I don't actually know how to make this work under Linux offhand.
Something something hugetlb. I can do the sysadmin-y parts and allocate
the 2MB pages in the OS, but will have to read up on the programmer-y parts.
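
For the programmer-y side, one possible shape is sketched below. Treat
it as a guess rather than a tested design: it assumes a Linux whose
mmap(2) supports the MAP_HUGETLB flag (the flag is not part of the patch
and is missing on older kernels, hence the #ifdef), and the helper name
alloc_cache_region() is made up. If no huge pages have been reserved in
the OS, the first mmap() fails and the code falls back to a normal
anonymous mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map 'len' bytes backed by huge pages and
 * fall back to regular pages. Returns NULL if both attempts fail.
 * *got_huge tells the caller which kind of mapping it received. */
static void *alloc_cache_region(size_t len, bool *got_huge) {
    void *p = MAP_FAILED;

    assert(len > 0 && got_huge != NULL);
    *got_huge = false;

#ifdef MAP_HUGETLB
    /* Fails (typically with ENOMEM) if no huge pages are available. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *got_huge = true;
        return p;
    }
#endif

    /* Fallback: ordinary anonymous mapping with the default page size. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}
```

A caller could use got_huge to print a warning when -L was given but
only regular pages were available.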

What do you think about removing the option if built without the proper
support? The contiguous allocation doesn't give all that much benefit
otherwise.
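
Roughly what I have in mind, as a sketch only: HAVE_LARGE_PAGES is a
hypothetical macro that configure would define when the platform support
is found, and the option string is trimmed down to "-m" plus "-L" for
illustration. Without the macro, -L is simply not a recognized option.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* HAVE_LARGE_PAGES is hypothetical; it stands in for whatever the
 * build system would define when large page support is detected. */
#ifdef HAVE_LARGE_PAGES
#define OPTSTRING "m:L"
#else
#define OPTSTRING "m:"
#endif

/* Returns true if 'opt' is an option letter this build accepts. */
static bool option_supported(char opt) {
    assert(opt != '\0' && opt != ':');
    return strchr(OPTSTRING, opt) != NULL;
}
```

The same #ifdef would wrap the -L line in the usage text, so the help
output matches what the binary can actually do.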

Anyone else have thoughts on this? Memcached is a perfect target for
HugeTLB support, and I can write adequate documentation for making it
work under Linux (having done plenty of Oracle installations :)). *BSD
support is beyond me though, and I'd welcome some brainpower there.

Thanks!
-Dormando

Trond Norbye wrote:
> This patch implements support for using large memory pages on systems
> that support them (must be enabled with -L on the command line). The
> main purpose of using large memory pages is to increase the amount of
> address space that can be used without causing a TLB miss
> (see http://en.wikipedia.org/wiki/Translation_lookaside_buffer for a
> description of the TLB). When using large pages, the slab allocator
> allocates the total cache size during startup in one big malloc instead
> of calling malloc each time we need a new slab page (in order to get the
> biggest available pages on the system). Also, since malloc uses internal
> mutex locking, we could block waiting for other threads calling malloc
> (and friends). Since access to the slab allocator is guarded by a
> single mutex, all access to the slab allocator would suffer from this.
> 
> The tests remain to be written, and I am a bit unsure how to do it.
> There are at least two problems I see for a generic test:
> 1) Not all systems support multiple page sizes (or the code doesn't
> know how to enable them). On those platforms a warning is printed if you
> start memcached with -L, but you still get the benefit of the slab
> allocator grabbing all memory up front (benefit == you will not call
> malloc when a slabclass needs more space).
> 2) Your system may have been running for a long time, so that the
> system memory is too fragmented to actually be able to return large
> memory pages.
> 
> Verifying the behavior without -L is pretty simple: just start memcached
> and look at the memory footprint. If we start with -L the memory
> footprint should be a little bit bigger than the memory requested with
> -m, but verifying that we got large pages is a bit more difficult (due
> to 2 above).
> 
> If we ignore problem 2 above, we could (at least on Solaris) look at
> the result of "pmap -s <memcached pid>" and compare the page sizes
> reported there with the ones returned by "pagesize -a".
> 
> Comments anyone?
> 
> Trond