I would imagine that when you start seeing more evictions in your stats than you expect, you know your cache is full. I'd say anything else is over-complicating it.<br><br><div><span class="gmail_quote">On 5/25/07, <b class="gmail_sendername">
Ben Hartshorne</b> <<a href="mailto:memcache@green.hartshorne.net">memcache@green.hartshorne.net</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
OK, so the best I can come up with is to insert a new record every x<br>minutes (say, 60), pick the retention threshold you care about (how long<br>records should survive without being flushed by cache overflow), wait that<br>long, and then check the record and see if it's still there, verifying
<br>that your cache has not been blown. This seems rather crude, and only<br>tells you when you have already lost your cache, not when you are<br>getting close to doing so (though you can double your time threshold and<br>
hope that's good enough).<br><br>So for example, I am interested in making sure a record persists for 6<br>hours. I double that time to give a bit of a threshold so as to get<br>some warning. I insert 12 records, one every hour. After all have been
<br>inserted, I check the 12th record before resetting it. If it's still<br>there, my cache has enough space that records can last 12 hours. If<br>it's gone, I can step through them to determine how much of the cache
<br>had to be freed to make room for the current data.<br><br>Has anybody done this? Maybe as a Nagios plugin?<br><br>This procedure does raise a question, though, about which records are<br>dropped when the cache becomes
<br>full. How is it determined which records to drop? Oldest access time?<br>Oldest insert time? Largest record? I'm kinda hoping it's oldest<br>access time, because that makes the most sense for our environment
<br>(since popular records will stay in the cache longer than unpopular<br>records). :)<br><br>Thanks,<br><br>-ben<br><br><br>On Thu, May 24, 2007 at 12:49:54PM -0700, Ben Hartshorne wrote:<br>> Hi,<br>><br>> I am trying to evaluate the schedule for expanding our cache. We have
<br>> several different types of data all getting thrown into one large<br>> central cache (made up of 1 instance on each of 4 machines). Some of it<br>> is transient (caching 'popular' data for 5 minutes) and some of it is
<br>> more permanent (1wk expiration time). All data is backed by permanent<br>> storage for cache misses. On cache miss, the data is repopulated into<br>> memcache from permanent storage.<br>><br>> My problem - the 'stats' command has a metric 'curr_items' that reports
<br>> the current number of items stored. However, when a piece of data<br>> expires, that counter is not decremented until you issue a 'get' on the<br>> data and fail.<br>><br>> In order to cache 'popular' data, all data of that type is cached,
<br>> assuming that the popular ones will be hit and updated within the<br>> expiration time, and so remain in the cache, while the unpopular data<br>> will just expire and nobody cares.<br>><br>> The problem with the curr_items stat is that if I ever get a cache miss
<br>> on the transient data, I immediately fetch it from the database and<br>> stick it back in the cache, causing the curr_items to decrement and then<br>> increment again. Data that is unpopular is stored (causing an
<br>> increment) but never retrieved, so the curr_items never decrements. The<br>> effect is a monotonically increasing number in curr_items until it tops<br>> out (at 2352883, though I'm not sure what's special about that number).
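[To illustrate the behavior described above: memcached reclaims expired items lazily, so curr_items only drops when an expired key is actually fetched. A toy model of that mechanism (not memcached's real code, just a sketch of the accounting):]

```python
import time

class LazyExpiringCache:
    """Toy model of lazy expiration: an expired item is reclaimed
    (and the item count decremented) only when a get() notices it
    has expired -- mirroring why curr_items never drops for keys
    that expire but are never fetched again."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry_timestamp)

    def set(self, key, value, ttl):
        self._store[key] = (value, time.time() + ttl)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if now >= expires:
            # Expiration is only noticed here, on access.
            del self._store[key]
            return None
        return value

    @property
    def curr_items(self):
        # Still counts expired-but-unfetched items, like the stat.
        return len(self._store)

cache = LazyExpiringCache()
cache.set("popular", "hot data", ttl=300)
cache.set("unpopular", "cold data", ttl=300)

later = time.time() + 600          # pretend 10 minutes have passed
assert cache.curr_items == 2       # both expired, neither reclaimed yet
cache.get("popular", now=later)    # miss: reclaimed, count drops...
assert cache.curr_items == 1       # ...but "unpopular" lingers forever
```

[Keys that are only ever written, never read back, stay in the count indefinitely, which is exactly the monotonic curr_items growth described in the message.]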
<br>><br>> Because of the different types of data and the changing popularity of<br>> data, cache hit percentage is not a good proxy for telling me when my<br>> cache has filled up.<br>><br>> At the moment, I am pretty sure the cache is not full because the more
<br>> persistent data (1wk expiry time) usually sticks around, though I don't<br>> have a good metric to prove it; that just seems to be the case.<br>><br>> How do I tell when my cache is full and I need to add the 5th server? I
<br>> tried watching the memory utilization as reported by the OS but it is<br>> also monotonically increasing until it tops out at the limit given to<br>> memcache (7GB, in this case).<br>><br>> Thanks for any advice you might have,
<br>><br>> -ben<br>><br>><br>> --<br>> Ben Hartshorne<br>> email: <a href="mailto:ben@hartshorne.net">ben@hartshorne.net</a><br>> <a href="http://ben.hartshorne.net">http://ben.hartshorne.net</a><br>
<br><br><br>--<br>Ben Hartshorne<br>email: <a href="mailto:ben@hartshorne.net">ben@hartshorne.net</a><br><a href="http://ben.hartshorne.net">http://ben.hartshorne.net</a><br><br></blockquote></div><br>
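[To close the loop on the eviction question in the thread: memcached evicts on a least-recently-used basis (tracked per slab class), so "oldest access time" is indeed the right intuition. A toy LRU cache sketching both that behavior and the canary-record check described above; the key names, capacity, and `evictions` counter are illustrative, not memcached internals:]

```python
from collections import OrderedDict

class LRUCache:
    """Toy least-recently-used cache. Real memcached keeps an LRU
    per slab class, but the eviction order -- least recently
    accessed first -- is the same idea."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.evictions = 0              # like the 'evictions' stat
        self._items = OrderedDict()     # least recently used first

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop least recently used
            self.evictions += 1

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)    # access refreshes LRU position
        return self._items[key]

cache = LRUCache(capacity=3)
cache.set("canary", "planted hours ago")  # the monitoring record
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")                            # 'a' is now recently used
cache.set("c", 3)                         # over capacity: evict LRU

# The never-read canary went first, not the popular record 'a':
assert cache.get("canary") is None
assert cache.get("a") == 1
assert cache.evictions == 1
```

[In practice the first reply's suggestion is simpler than planting canaries: watch the `evictions` counter in the server's `stats` output, and treat a sustained non-zero rate as the signal that the cache is full.]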