MOM: Memcached Operations Monitoring

Randy Wigginton krw at
Thu Oct 11 21:16:01 UTC 2007

Hi All,

My company does something unusual with memcached that is extremely  
valuable to us, and I'm wondering if others would find the code  
useful.  I apologize if this email is long, but what I'm proposing  
requires some background explanation, as it is radically different  
from the typical use of memcached.  The ultimate question is whether   
it is worthwhile to make the code available for others to use.

One problem with running a large site (hundreds of millions of hits
per hour) is keeping track of what is going on.  At one point I
worked on the eBay SWAT team, and I would get calls at 2 in the
morning from the operations center, wondering why a set of machines
was acting up.  Without instrumentation, it is nearly impossible to
figure out.  With the proper instrumentation, it is child's play.
eBay has a system that tracks all activity on the site; however,
that system is VERY large and VERY expensive.  Using memcached, I've
developed something that gives you 90% of the value of eBay's system
for perhaps 1% of the cost.

I have modified memcached as well as the Java client library; with
these modifications, and very few lines of code in the application, I
can tell precisely how many URLs and SQL statements are executing on
a particular machine in any given minute or in any given hour.  I can
tell you the average execution time, the maximum execution time, as
well as the number of failures.  I can tell you which URLs were
expensive, which URLs invoked SQL statements, and which URLs failed
most often.  Coupled with a small MySQL database, I can give you more
operational statistics on our site than many larger sites have.
Just to reassure those who are assuming this must be a very expensive
use of memcached, I can say from experience that with a single
instance running on a Linux box with a mere 10 MB of memory assigned,
we aggregate information on about 20 pools and several hundred
machines at a rate of 10-15K operations per second, and have never
gotten close to capacity.

Would anyone else be interested in this?  Or is this too far off the  
beaten path?  It is mostly helpful for very busy sites.  Thanks.

