What expense is there running across servers?

Patrick Galbraith patg at grazr.com
Wed Nov 7 15:54:13 UTC 2007

There is some debate at my company about how to get the most out of
memcached. One main point is whether to run a single cluster across
machines, even across data centers, or to give each server its own
cache.

We will be using memcached to serve JSON (a JSON cache) to our Grazr
widget. Currently this is done with MySQL, using replication to
distribute the cache (currently two tables, a lookup table and a blob
table, 1:1). Originally it was all done with flat files, which differed
from server to server. With replication, when one server caches a feed
(the caching is done by expensive mod_perl application code), the feed
is stored through a write handle to the master (there is one master in
each data center, with web heads and slave DBs attached). The cached
JSON representation of the feed is then replicated out to all slaves,
so every slave can serve the 'cached' JSON instead of having to load it
through the mod_perl code.

With memcached, the discussion now is whether to have one instance
spanning all machines, with all cached JSON available everywhere, or to
have separate caches on each box.
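
As far as I understand it, in the single-instance setup the client
library picks one server per key by hashing the key, so each item lives
on exactly one box and every web head that uses the same server list
finds it there. A minimal sketch, assuming plain hash-modulo selection
(real clients may use other schemes, such as consistent hashing); the
host names and the server_for() helper are made up for illustration:

```python
import hashlib

# Hypothetical server pool; the host names are placeholders.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]


def server_for(key, servers=SERVERS):
    """Map a key to one server using a stable hash modulo the pool size."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Because the hash is stable, every web head configured with the same
list computes the same server for a given key. With separate caches,
each box would instead talk only to its own local instance.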

I'm in favor of a single cluster. Whatever small overhead there is to
"replicate" the cached JSON across memcached instances is much smaller
than the cost of possibly re-caching a feed on a box that does not have
it, when with one cluster that box could simply look it up and serve it
from cache.
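
A back-of-the-envelope way to see the difference is to count how many
times the expensive caching step runs in each setup. This is only a toy
model, not a measurement: it assumes every web head eventually serves
every feed once, nothing is evicted, and the numbers (4 web heads, 100
feeds) are made up.

```python
def count_renders(num_web_heads, feeds, shared):
    """Count expensive renders when every web head serves every feed once."""
    shared_cache = set()
    local_caches = [set() for _ in range(num_web_heads)]
    renders = 0
    for head in range(num_web_heads):
        # One cache for everybody, or one private cache per web head.
        cache = shared_cache if shared else local_caches[head]
        for feed in feeds:
            if feed not in cache:
                renders += 1  # cache miss: run the expensive code
                cache.add(feed)
    return renders


feeds = ["feed-%d" % i for i in range(100)]
renders_separate = count_renders(4, feeds, shared=False)  # 400 renders
renders_shared = count_renders(4, feeds, shared=True)     # 100 renders
```

Under these assumptions, separate caches repeat the expensive step once
per box, while a single cluster pays for it once per feed and adds a
network lookup to each request.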

What I'm asking in this post is how others would argue that a single
instance is the optimal solution, and what overhead there is in
networking the cluster. What are others' experiences out there? Have
any tests been done to see how this works?

I understand the technical explanation and don't need convincing that
libevent makes the "replication" (or "glue") of the cluster fast. Also,
everyone I know of who uses memcached runs it as a single cluster.
Maybe that's a bit simplistic and monkey-see-monkey-do, but I would
suppose people have adopted this architecture for a good reason!

What do others think, and what would you all contribute to this discussion?

Thanks in advance!


Patrick Galbraith, Senior Programmer 
Grazr - Easy feed grazing and sharing

Satyam Eva Jayate - Truth Alone Triumphs
Mundaka Upanishad
