how to tell when your cache is full?

Ben Hartshorne memcache at
Fri May 25 18:01:51 UTC 2007

OK, so the best I can come up with is to insert a new record every x
minutes (say, 60), pick how long you expect (or need) records to
survive without being pushed out by an overflowing cache, wait that
long, and then check whether the record is still there.  If it is,
your cache has not been blown.  This seems rather crude, and it only
tells you when you have already lost your cache, not when you are
getting close to doing so (though you can double your time threshold and
hope that's good enough).

So for example, I am interested in making sure a record persists for 6
hours.  I double that time to give myself a margin and get some
warning.  I insert 12 records, one every hour.  After all have been
inserted, I check the oldest record (the one inserted 12 hours earlier)
before resetting it.  If it's still there, my cache has enough space
that records can last 12 hours.  If it's gone, I can step through the
other records to determine how far back the cache had to be freed to
make room for the current data.
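
Roughly, the procedure could look like the Python sketch below.  This is
only an illustration under stated assumptions: TinyLRU is a made-up
in-memory stand-in so the sketch runs without a server, and the key
names (canary:N) and function names are hypothetical.  Any client
object with get() and set() could be dropped in instead.

```python
from collections import OrderedDict

SLOTS = 12  # one canary record per hour, kept for 12 hours


class TinyLRU:
    """Hypothetical stand-in for a cache client (NOT memcached itself).

    Assumes eviction by oldest access time, purely for illustration.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        while len(self.data) > self.capacity:
            self.data.popitem(last=False)  # drop least recently used

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # a read counts as an access
        return self.data[key]


def canary_key(slot):
    return "canary:%d" % slot


def hourly_tick(client, hour, slots=SLOTS):
    """Check the canary written `slots` hours ago, then overwrite it.

    Returns None while the first `slots` canaries are still being
    written, otherwise True if the oldest canary survived.
    """
    key = canary_key(hour % slots)
    survived = None
    if hour >= slots:
        # Each canary stores the hour it was written.
        survived = client.get(key) == hour - slots
    client.set(key, hour)
    return survived


def oldest_surviving_age(client, hour, slots=SLOTS):
    """Step through the remaining canaries, oldest first.

    Returns the age in hours of the oldest canary still present, or 0
    if none is. Caveat: if eviction is by access time, these reads may
    themselves keep canaries alive, so call this only after a failure.
    """
    for age in range(slots - 1, 0, -1):
        written = hour - age
        if written >= 0 and client.get(canary_key(written % slots)) == written:
            return age
    return 0
```

hourly_tick() would be run once an hour (from cron, say) with an
increasing hour counter; when it returns False, oldest_surviving_age()
gives a rough idea of how many hours of data the cache currently holds.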

Has anybody done this?  Maybe as a nagios plugin?

This procedure does raise a question, though: when the cache becomes
full, how is it determined which records to drop?  Oldest access time?
Oldest insert time?  Largest record?  I'm kinda hoping it's oldest
access time, because that makes the most sense for our environment
(since popular records will stay in the cache longer than unpopular
records).  :)
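
One way to find out empirically might be a small probe like the Python
sketch below: write two records, read the first one again, overflow the
cache, and see which of the two is gone.  StandInCache is a hypothetical
in-memory stand-in so the sketch runs on its own; the probe:* key names
and the `fillers` count are assumptions too, and against a real server
the result may depend on details such as record sizes.

```python
from collections import OrderedDict


class StandInCache:
    """Hypothetical bounded cache used only to exercise the probe.

    touch_on_get=True  -> evicts by oldest access time
    touch_on_get=False -> evicts by oldest insert time
    """

    def __init__(self, capacity, touch_on_get):
        self.capacity = capacity
        self.touch_on_get = touch_on_get
        self.data = OrderedDict()

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        while len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict from the old end

    def get(self, key):
        if key not in self.data:
            return None
        if self.touch_on_get:
            self.data.move_to_end(key)
        return self.data[key]


def probe_eviction_order(cache, fillers):
    """Guess the eviction order from which probe record gets dropped.

    `fillers` should be just enough new records to overflow the cache.
    """
    cache.set("probe:a", 1)   # inserted first
    cache.set("probe:b", 1)   # inserted second
    cache.get("probe:a")      # ...but a is now the more recently read
    for i in range(fillers):
        cache.set("probe:filler:%d" % i, 1)
    a_alive = cache.get("probe:a") is not None
    b_alive = cache.get("probe:b") is not None
    if a_alive and not b_alive:
        return "oldest access time"
    if b_alive and not a_alive:
        return "oldest insert time"
    return "inconclusive"  # both survived or both dropped
```

If both records survive or both vanish, the filler count was too small
or too large and the probe would need to be repeated with a different
value.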



On Thu, May 24, 2007 at 12:49:54PM -0700, Ben Hartshorne wrote:
> Hi,
> I am trying to evaluate the schedule for expanding our cache.  We have
> several different types of data all getting thrown into one large
> central cache (made up of 1 instance on each of 4 machines).  Some of it
> is transient (caching 'popular' data for 5 minutes) and some of it is
> more permanent (1wk expiration time).  All data is backed by permanent
> storage for cache misses.  On cache miss, the data is repopulated into
> memcache from permanent storage.  
> My problem - the 'stats' command has a metric 'curr_items' that reports
> the current number of items stored.  However, when a piece of data
> expires, that counter is not decremented until you issue a 'get' on the
> data and fail.  
> In order to cache 'popular' data, all data of that type is cached,
> assuming that the popular ones will be hit and updated within the
> expiration time, and so remain in the cache, while the unpopular data
> will just expire and nobody cares.  
> The problem with the curr_items stat is that if I ever get a cache miss
> on the transient data, I immediately fetch it from the database and
> stick it back in the cache, causing the curr_items to decrement and then
> increment again.  Data that is unpopular is stored (causing an
> increment) but never retrieved, so the curr_items never decrements.  The
> effect is a monotonically increasing number in curr_items until it tops
> out (at 2352883, though I'm not sure what's special about that number).  
> Because of the different types of data and the changing popularity of
> data, cache hit percentage is not a good proxy for telling me when my
> cache has filled up.
> At the moment, I am pretty sure the cache is not full because the more
> persistent data (1wk expiry time) usually sticks around, though I don't
> have a good metric to prove that, it seems to be the case.
> How do I tell when my cache is full and I need to add the 5th server?  I
> tried watching the memory utilization as reported by the OS but it is
> also monotonically increasing until it tops out at the limit given to
> memcache (7GB, in this case).
> Thanks for any advice you might have,
> -ben
> -- 
> Ben Hartshorne
> email: ben at

Ben Hartshorne
email: ben at