[PATCH] minor cleanups

Steven Grimm sgrimm at facebook.com
Fri Apr 13 19:44:11 UTC 2007


Brad Fitzpatrick wrote:
> Assuming the code is battle tested & stable, the only remaining questions
> are around the invasiveness of the code, and long-term maintenance cost,
> compared to the advantages.  Debugging and hacking on a single-threaded
> app is a breeze compared to debugging buggy locking.
>   

I did my best to minimize the invasiveness of the changes. Obviously I'd 
love it if others could offer their opinions. All the thread support 
goes away (i.e., you don't pay any overhead at all) if you don't add 
"--enable-threads" to your ./configure command line; the code is 
structured such that all the thread stuff is inside #ifdef blocks that 
revert to the single-threaded behavior if a particular macro isn't defined.
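
As a rough illustration of that structure (not the actual patch), the sketch below wraps locking in macros that compile to nothing unless a macro is defined. The names USE_THREADS, CACHE_LOCK, CACHE_UNLOCK, cache_lock and item_count_increment are assumptions made up for this example.

```c
#include <stdio.h>

#ifdef USE_THREADS              /* hypothetical macro, set by the configure option */
#include <pthread.h>
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
#define CACHE_LOCK()   pthread_mutex_lock(&cache_lock)
#define CACHE_UNLOCK() pthread_mutex_unlock(&cache_lock)
#else
/* Single-threaded build: the lock calls expand to no-ops, so there is
 * no locking overhead and no dependency on pthreads. */
#define CACHE_LOCK()   ((void)0)
#define CACHE_UNLOCK() ((void)0)
#endif

static unsigned long item_count = 0;

/* Shared-state update written once; only the macros differ between builds. */
unsigned long item_count_increment(void) {
    unsigned long n;
    CACHE_LOCK();
    n = ++item_count;
    CACHE_UNLOCK();
    return n;
}
```

Built without -DUSE_THREADS this is plain single-threaded code; built with it (and linked against pthreads) the same function takes a mutex around the update.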

However, that does not negate the point about maintenance cost; 
obviously it is possible to introduce a bug that doesn't manifest at all 
in single-threaded mode but kills the process when it's compiled with 
thread support. To that end, I tried to make the locking as simple as 
possible. Again, if people have feedback on that I'd love to hear it.

> And yes, multi-core is common, but there's another easier answer:  run 'n'
> processes per machine, which is what everybody does now.  Yes, it's
> currently a manual process, but it could be automated.  That also is some
> more work, though.
>   

And it isn't without its downsides, either. We used to run that way. But 
it actually doesn't give you as much capacity per machine as running one 
multithreaded process, for a few reasons:

* If you have objects in rarely used size classes you'll be more likely 
to waste memory on them because you'll possibly get one slab of that 
class for each instance, rather than just a single slab that they can 
all share.
* If you're using persistent TCP connections, you will eat memory for 
kernel (and application) I/O buffers for the extra connections, reducing 
the amount of memory you can devote to your cache.
* Large batch "get" requests will have to be split among all the 
processes, which is much less CPU-efficient than sending the entire 
batch to one process, which means you will max out your CPUs at fewer 
requests per second.
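
To put rough numbers on the first and third points, here is a small sketch. It assumes a fixed slab size passed in by the caller and a made-up key-to-process mapping (a djb2 string hash modulo the process count); the function names are hypothetical and none of this is taken from memcached or its clients.

```c
#include <stddef.h>

/* Worst-case memory pinned by rarely used size classes: each instance
 * may end up holding its own mostly empty slab for each such class. */
size_t rare_class_overhead(size_t instances, size_t rare_classes,
                           size_t slab_bytes) {
    return instances * rare_classes * slab_bytes;
}

/* Hypothetical key-to-instance mapping: djb2 hash modulo instance count. */
static size_t pick_instance(const char *key, size_t instances) {
    unsigned long h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return (size_t)(h % instances);
}

/* Number of separate "get" requests one batch of keys turns into when
 * the keys are spread over `instances` processes (at most 64 here).
 * Returns 0 for an unsupported instance count. */
size_t requests_for_batch(const char *const *keys, size_t nkeys,
                          size_t instances) {
    unsigned char hit[64] = {0};
    size_t requests = 0;
    size_t i;

    if (instances == 0 || instances > 64)
        return 0;
    for (i = 0; i < nkeys; i++) {
        size_t idx = pick_instance(keys[i], instances);
        if (!hit[idx]) {        /* first key for this instance: one more request */
            hit[idx] = 1;
            requests++;
        }
    }
    return requests;
}
```

With one process, any non-empty batch is a single request. With several, the same batch fans out into up to one request per process, each paying its own parsing and syscall cost, and the slab overhead grows linearly with the number of instances.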

And of course managing multiple processes per machine adds a bit of 
system administration complexity, though that'll vary depending on 
people's setups.

> So I guess it's time to start looking at diffs between trunk and
> multithreaded and see if it's small/simple enough to merge.  (and assuming
> it's still a compile-time option, to completely disable threading...)
>   

Yep, it is! That was a big goal of my changes, since I knew not everyone 
would need or even want threading.

-Steve

