SQL/memcache tradeoff

Fri Oct 13 12:37:05 UTC 2006

On the whole the general case of grabbing a cached copy of an object  
that would otherwise have required a trip to the DB is fine,  
especially when you only request one item at once, e.g.

$o = new myobject($someid);

Might cause a DB hit the first time, then come out of cache the next.  
That's fine, works great. However, there are many occasions when I'm  
doing something that will return multiple items, such as:

$p = $o->getRelatedThings();

Especially when dealing with pre-memcache code that contains DB  
optimisations, this is likely to create multiple instances from a  
single DB lookup. As it's doing that, it can also stuff them in the  
cache. If you use this approach a lot, you end up doing far more  
cache writes than reads! A lookup returning 100 objects will result  
in 1 DB request, 100 instantiations, 100 cache writes and no cache  
reads.

So, if I lose the DB optimisation and go back to doing a single look- 
up for each object (playing to memcache's strengths, but giving  
terrible performance without it), I'd be doing <= 100 DB requests, <=  
100 cache reads, <= 100 cache writes, 100 instatiations (or  
__wakeup). At some point there must be a tradeoff between doing a  
more efficient DB lookup and doing more cache reads. How many items  
is that likely to be for? For 100 items, 100 memcache lookups would  
have to be faster than one DB lookup for 100 items. Is that the case?  
Is it typically higher? If so, how much?

Is there a middle way? Is it worthwhile culling the requests for  
instances that are in the cache already before submitting the request  
to the DB? Would the overhead of doing that pay off?

Marcus
-- 
Marcus Bointon
Synchromedia Limited: Creators of http://www.smartmessages.net/
marcus at synchromedia.co.uk | http://www.synchromedia.co.uk/