Caching collections of objects
brianm at dealnews.com
Sun May 27 14:54:45 UTC 2007
We have been caching at dealnews in some form since the company began 10
years ago. We have been using memcached for over a year.
> 1. Caching of collections (eg: "give me all the user comments related to
> 2. Caching entire page outputs based upon the unique url (eg: "give me
> the xhtml output for foo/bar?baz=1")
> 3. A combination of #1 & #2
We do both at dealnews.
> Getting the data into the cache and retrieving it when appropriate is a
> simple matter. It is when the data in the cache has become stale and I
> need to flush it from the cache that I become stuck as to how best to
> solve the problem.
I have seen this worry a lot on this list. For us it is all about the
ttl. We decided on a ttl we could live with for objects. It's just 2
minutes for our front page. But even with a 2 minute ttl we get an 85% cache
hit rate. Well worth it. For other pages it's 15 minutes. For some it's an
hour, and really old content is cached for a day. When you start getting
into serious traffic, you have to let go of the obsession that the
content all gets updated at the exact same time on every page
everywhere. It's just not realistic anymore.
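
To make the ttl idea concrete, here is a rough sketch in Python. It is not our actual code: the TTLCache class is an in-memory stand-in for a memcached client, and the page type names, fetch_page, and render are all made up for illustration. Only the ttl values come from the numbers above.

```python
import time

# Hypothetical ttl table, in seconds, using the figures quoted above.
TTL_BY_PAGE_TYPE = {
    "front_page": 2 * 60,
    "regular": 15 * 60,
    "slow_changing": 60 * 60,
    "archive": 24 * 60 * 60,
}


class TTLCache:
    """In-memory stand-in for a memcached client; entries expire after a ttl."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl):
        self._store[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The ttl has run out: drop the entry and report a miss.
            del self._store[key]
            return None
        return value


def fetch_page(cache, url, page_type, render):
    """Return the cached output for url, rendering and caching it on a miss."""
    page = cache.get(url)
    if page is None:
        page = render(url)
        cache.set(url, page, TTL_BY_PAGE_TYPE[page_type])
    return page
```

Nothing ever gets flushed here. A stale front page simply disappears on its own within 2 minutes and gets rebuilt by the next request.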
For object level stuff, we do some updating. We have processes that
regenerate content and freshen that memcache data when needed. But
those are very few objects, the ones that are hooked into our existing
publishing system. We don't go looking for every place that object X may
be on a page and remove it. We let the ttl take care of that.
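
Roughly, the object level refresh could look like the sketch below. Everything in it is an assumption for illustration: FakeCache is a dict-backed stand-in for a memcached client, and publish, object_key, and OBJECT_TTL are invented names, not our publishing system.

```python
class FakeCache:
    """Dict-backed stand-in for a memcached client (ttl is recorded, not enforced)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, ttl):
        self.data[key] = (value, ttl)

    def get(self, key):
        entry = self.data.get(key)
        return entry[0] if entry else None


OBJECT_TTL = 15 * 60  # hypothetical ttl for object level entries, in seconds


def object_key(object_id):
    # Hypothetical key scheme for object level cache entries.
    return "object:%s" % object_id


def publish(cache, database, object_id, content):
    """Save the object, then overwrite its own cache entry with fresh data.

    Pages that embed the object are deliberately not touched; their
    cached copies stay as they are until their own ttl runs out.
    """
    database[object_id] = content
    cache.set(object_key(object_id), content, OBJECT_TTL)
```

The point of the design is what publish does not do: it never tries to track down the pages that contain the object.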
The important part of this method, however, is that you must be able to
deal gracefully with your cache expiring at some point. If your
site can't deal with having a couple of pieces of expired cache on the
page, then you will be in trouble.
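
One way to picture "gracefully" is a page builder that treats every missing piece as routine. This is only a sketch under assumptions: a plain dict stands in for memcached, a missing key models an expired entry, and build_page and the generators mapping are hypothetical names.

```python
def build_page(cache, fragment_names, generators):
    """Assemble a page from cached fragments, regenerating any that are gone.

    cache is a plain dict standing in for memcached; a missing key models
    an entry whose ttl has run out. generators maps each fragment name to
    a hypothetical function that rebuilds that fragment from the database.
    """
    parts = []
    regenerated = []
    for name in fragment_names:
        html = cache.get(name)
        if html is None:
            # Expired or never cached: rebuild just this piece and carry on.
            html = generators[name]()
            cache[name] = html
            regenerated.append(name)
        parts.append(html)
    return "\n".join(parts), regenerated
```

If a couple of fragments have expired, only those get rebuilt and the rest of the page still comes straight from the cache.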
> One solution for approach #2 above is to simply flush all cached page
> data whenever there are writes to the database. Though this is
> sub-optimal and would result in low cache hit rates I'm assuming.
I don't know how often you write to your database. But, yeah, that
would be quite useless for us. If you are using MySQL, you can just use
the MySQL query cache for that effect, since it already throws away its
cached results for a table whenever that table is written to.
It's good to be cheap =)