file side memcached

Mon Jun 5 16:53:31 UTC 2006

> Nah.   In both BDB and Memcached's case the most frequently
> accessed pages in the system will be in memory, and the least
> frequently will be on disk.   In BDB, the frequently accessed
> pages will be in the kernel's page caches and up-to-date
> versions will not be on the disk anyway (unless you send a
> bunch of unnecessary sync() calls).  And it's true with
> Memcached, where the least frequently used pages should
> be swapped to disk if there are better uses for the memory.

It works out to be a bit more random than that. "Recently" in terms of 
what chunks of memcached memory have been accessed means within the last 
instant, and there's absolutely no guarantee that memcached's slabs or 
BDB's memory chunks are going to align with the pages that go out to 
swap. You're much more likely to end up with your data fragmented in 
swap than you are with entire slabs of junk usefully moved to disk. Sure, 
a few things here and there will usefully move out of the way, but you've 
most likely just killed memcached's performance by adding IO into the 
equation.
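
To put rough numbers on the alignment problem, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: a 4096-byte page, fixed-size items that never straddle a page boundary, and hot items scattered uniformly at random through a slab. The function name is made up. It only estimates how many pages hold nothing but cold items, since those are the only pages that can go to swap without hurting you.

```python
# Back-of-the-envelope sketch. All numbers are assumptions for illustration,
# not measurements taken from memcached or BDB.
PAGE_SIZE = 4096  # bytes; a common page size, assumed here


def fraction_of_all_cold_pages(item_size, hot_fraction, page_size=PAGE_SIZE):
    """Estimate the share of pages that contain only cold items.

    Assumes fixed-size items that do not straddle pages and hot items
    placed uniformly at random, so each item is cold with probability
    (1 - hot_fraction) independently of its neighbours.
    """
    items_per_page = max(1, page_size // item_size)
    return (1.0 - hot_fraction) ** items_per_page


if __name__ == "__main__":
    for item_size in (64, 256, 1024):
        cold = fraction_of_all_cold_pages(item_size, hot_fraction=0.10)
        print(f"{item_size:5d}-byte items: {cold:.2%} of pages are entirely cold")
```

Under those assumptions, with 10% of items hot, about two thirds of pages are entirely cold for 1024-byte items, but only around a tenth of a percent for 64-byte items. The smaller the items, the more likely every page has something hot on it.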

> Totally disagree.   Swap is useful for infrequently accessed
> memory so that your valuable physical memory can be used for
> frequently accessed data, whether those pages are backed
> by files (bdb databases) or not.
> 
> Perhaps one of the most authoritative sources would
> be Andrew Morton (linux 2.6 kernel maintainer)
> who words it better than I can:
> 
> http://kerneltrap.org/node/3000
>    "My point is that decreasing the tendency of the
>    kernel to swap stuff out is wrong. You really
>    don't want hundreds of megabytes of BloatyApp's
>    untouched memory floating about in the machine.
>    Get it out on the disk, use the memory for
>    something useful."
>    - Andrew Morton

IIRC that discussion was about *desktop* machines. In typical desktop 
usage you're running 10+ applications with huge amounts of bloat. 
Knocking the swappiness value way up can help keep interactivity high 
with lowish memory setups, but only for the application you're currently 
using.

On a *database* server, even with swappiness left to its own devices, I 
rarely have more than two megabytes in swap after normal usage. In a 
server system anything with huge amounts of bloat is not even running on 
the device! MySQL, Apache, etc. don't have a lot of overhead going with 
them. At most a few megabytes, not hundreds of megabytes. If you have a 
server app that's wasting that much memory you have other issues.
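
If you want to check this on your own box, here's a minimal sketch that reads swap usage and the current swappiness setting. It assumes a Linux system exposing /proc/meminfo and /proc/sys/vm/swappiness; the helper names are made up for the example.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts. Only the leading integer on each line is kept.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap in use, in kB, from a parsed meminfo dict."""
    return info["SwapTotal"] - info["SwapFree"]


def main():
    # Both paths are Linux-specific; skip quietly anywhere else.
    if not os.path.exists("/proc/meminfo"):
        print("no /proc/meminfo here; this sketch assumes Linux")
        return
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print(f"swap used: {swap_used_kb(info)} kB of {info['SwapTotal']} kB")
    if os.path.exists("/proc/sys/vm/swappiness"):
        with open("/proc/sys/vm/swappiness") as f:
            print(f"vm.swappiness: {f.read().strip()}")


if __name__ == "__main__":
    main()
```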

A fun trick to play with your swap usefulness is to run a file copy 
backup on your server. Odds are the data you're pulling *once* off of 
disk to put elsewhere into a backup will push out a huge percentage of 
your server's active data, either into swap or just directly out of page 
cache. Fedora had a fetish for doing this last I used it. If I tried to 
copy large files off of the box, it would swap to hell and back. Earlier 
releases of Fedora would constantly swap out pages of memory that were 
just swapped back in less than a second later, all in an effort to 
prioritize page cache over "inactive" application memory, which is 
mostly useless on a server!

How does the OS know the difference between me copying database files to 
another server, or me preloading database indexes into memory? You have 
to program explicitly for these cases.
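
As a rough idea of what "programming explicitly" can look like, here's a sketch using posix_fadvise(2) through Python's os.posix_fadvise. It assumes a POSIX system where that call is available. The function names and chunk size are made up for the example, and the hints are advisory only: the kernel is free to ignore them. One caveat: DONTNEED also drops pages that were already cached because your real workload was using them, so this suits files you don't otherwise serve from.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for the sketch


def _advise(fd, offset, length, advice_name):
    """Send a posix_fadvise hint if the platform has it; ignore failures."""
    advise = getattr(os, "posix_fadvise", None)
    advice = getattr(os, advice_name, None)
    if advise is None or advice is None:
        return  # not available on this platform
    try:
        advise(fd, offset, length, advice)
    except OSError:
        pass  # the hint is advisory; the copy must not fail because of it


def copy_once(src_path, dst_path, chunk=CHUNK):
    """Copy a file we only read once, asking the kernel not to keep it cached."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        _advise(src.fileno(), 0, 0, "POSIX_FADV_SEQUENTIAL")
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            dst.write(buf)
            # Tell the kernel the source range we just read won't be reused.
            _advise(src.fileno(), copied, len(buf), "POSIX_FADV_DONTNEED")
            copied += len(buf)
        # Dirty pages can't be dropped, so flush before hinting on the copy.
        dst.flush()
        os.fsync(dst.fileno())
        _advise(dst.fileno(), 0, 0, "POSIX_FADV_DONTNEED")
    return copied


def preload(path):
    """The opposite case: hint that we want the whole file in page cache."""
    with open(path, "rb") as f:
        _advise(f.fileno(), 0, 0, "POSIX_FADV_WILLNEED")
```

The backup job would call copy_once() and the index warm-up would call preload(). Both touch the same files in the same way as far as the kernel can see; the hint is the only thing telling it which is which.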

Stupid swap :(

-Dormando