<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

Dustin Sallings wrote:

<blockquote cite="mid23BD2364-7A7D-46E0-8225-F2E54E9436DD@spy.net"

 type="cite">

  <div>I

measured a considerable improvement in the initial version of my

implementation with a single thread over the multithreaded

implementation.&nbsp; I'll go ahead and try adding multiple connections per

destination and see if it makes a difference.<br>

  </div>

</blockquote>

<br>

I'd be perfectly willing to believe the other implementation is just

not as efficient as yours. Please try using multiple connections; I'm

sure everyone here would be curious to hear the results.<br>

<br>

<blockquote cite="mid23BD2364-7A7D-46E0-8225-F2E54E9436DD@spy.net"

 type="cite">Does

that really give a considerable performance benefit over running four

single-threaded processes on the same box?&nbsp; I'd be concerned about

locking (and correctness) on the server-side.</blockquote>

<br>

For small-scale installations, the difference is probably not
significant. But in our case, we see significant gains in two areas.
First, by running 1/4 the number of processes, we quadruple the chances
of a large "get" request wanting multiple objects from the same
instance, and a two-key "get" is far more CPU-efficient than two
one-key requests. Second (though this is mitigated to a large extent by
using a UDP-based client), we devote only 25% as much memory to I/O
buffers, including kernel socket buffers. Beyond those gains, it is
also easier for our operations people to manage one process per box
than four.<br>
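<br>
To put rough numbers on the first point, here is a minimal sketch. It
assumes keys hash uniformly and independently across instances, which
real key distributions will only approximate. The host count of 150 is
the figure given further down in this message; the 50-key request size
and the function names are made up for the illustration.<br>

```python
from fractions import Fraction


def same_instance_probability(instances):
    """Chance that two uniformly hashed keys land on the same instance."""
    return Fraction(1, instances)


def expected_requests(keys, instances):
    """Expected number of distinct instances a multi-key get must contact."""
    untouched = (1 - 1 / instances) ** keys  # chance a given instance gets no key
    return instances * (1 - untouched)


HOSTS = 150  # host count from the message; uniform hashing is an assumption
KEYS = 50    # hypothetical size of a large "get"

for per_host in (4, 1):
    n = HOSTS * per_host
    pair = float(same_instance_probability(n))
    print(n, "instances:", pair, "pairwise,",
          round(expected_requests(KEYS, n), 2), "requests per get")
```

Under that assumption the pairwise probability is 1/instances, so going
from 600 instances to 150 makes it four times as likely that any two
keys in a request share an instance, and the expected number of
per-instance requests for the same "get" goes down accordingly.<br>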

<br>

As for locking and correctness, please feel free to audit the code; the

locking should be pretty easy to follow. We are running it on about 150

high-traffic dedicated memcached hosts right now and it has been stable

and (as far as we know) error-free for us, but of course it's always

possible there are bugs.<br>

<br>

<blockquote cite="mid23BD2364-7A7D-46E0-8225-F2E54E9436DD@spy.net"

 type="cite">

  <div>Is

there any documentation on how this branch uses threads?&nbsp; It'd be an

interesting read.</div>

</blockquote>

<br>

doc/threads.txt has some information about the implementation. It is a

very conservative implementation; the code can be compiled as a

single-threaded application by leaving out a single -D option on the

compiler command line, in which case the execution paths are nearly

identical to the 1.2.x code base.<br>

<br>

It would be possible to do more major surgery, of course, if one were

willing to give up the option of compiling it single-threaded. And even

without giving that up, there are some obvious changes that could be

made to decrease lock contention (see the "TO DO" section in

doc/threads.txt). But the current implementation is sufficient for our

setup.<br>

<br>

-Steve<br>

</body>

</html>