Memcached Database Use
dustin at spy.net
Fri Jun 22 22:51:51 UTC 2007
On Jun 22, 2007, at 13:56 , Chris Miller wrote:
> I see how that by storing database results in memcached would be
> very helpful, but how does memcached know when the result set in
> cache has changed?
I wrote an app called diggwatch that uses the digg API as my
primary data store, and stores all the useful information in
memcached locally. Cache misses for me are really expensive, and the
digg API makes certain operations I want to perform somewhat difficult.
For example, the primary thing I wanted this app to do for me is
tell me when anyone responds to any comment I make on a digg
article. Basically, that looks like this:
1) Ask for any recent comments by username.
2) Ask for all of the stories to which any of these comments belong
so I can put useful titles on things.
3) Ask for any child comments of #1 (or children of the comments'
parent as defined by the old system).
As this is primarily used (at least by me) as an RSS provider, that
request occurs several times throughout the day and I'd like it to be
cached. However, I'd *also* like it to be fresh, and I don't get
notifications from digg.
I cache the result of #1 for about a minute -- a fairly insignificant
amount of time, but I don't consider that request very expensive.
I cache the results from #2 for about five minutes. It's a single
request for up to something like 100 stories, and I can optimize some
of it out if I have some of the stories in my cache already.
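
A minimal sketch of the story lookup described above, assuming an in-memory stand-in for a memcached client. TTLCache, get_stories, the "story:<id>" key format, and the fetch_stories callable are all hypothetical names made up for illustration; they are not diggwatch's code or the digg API. It only shows how a batched request can skip the stories that are already cached:

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat it as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)


STORY_TTL = 5 * 60  # seconds; "about five minutes"


def get_stories(cache, story_ids, fetch_stories):
    """Return {story_id: story}, fetching only the ids missing from the cache.

    fetch_stories is a hypothetical callable that takes a list of ids and
    returns {story_id: story} from a single batched request.
    """
    found = {}
    missing = []
    for sid in story_ids:
        story = cache.get(f"story:{sid}")
        if story is None:
            missing.append(sid)
        else:
            found[sid] = story
    if missing:
        # One request for everything we do not already have.
        for sid, story in fetch_stories(missing).items():
            cache.set(f"story:{sid}", story, STORY_TTL)
            found[sid] = story
    return found
```

With a real memcached client the per-key get calls would normally be replaced by a single multi-get, but the shape of the logic stays the same.
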
#3 is the most expensive query, because I need to run it almost once
per comment (result of #1). I cache these for about a day, *but* the
key includes the number of comments on a given story (which I get in
the result of #2). If nobody's commented on a story at all, I can be
guaranteed that nobody's commented on a thread I'm involved in within
that story.
It's not perfect, but it's quite effective and greatly reduces the
number of trips to digg without letting my data get more than ~5 minutes stale.
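
The comment-count trick can be sketched roughly as follows. This is an illustration under assumptions, not the actual diggwatch code: replies_key, get_replies, the key layout, and the fetch_replies callable are hypothetical, and a plain dict stands in for memcached (a real client would also be given the one-day expiry on set).

```python
REPLIES_TTL = 24 * 60 * 60  # seconds; "about a day" (unused by the dict stand-in)


def replies_key(comment_id, story_comment_count):
    """Build a cache key that changes whenever the story gains a comment."""
    return f"replies:{comment_id}:{story_comment_count}"


def get_replies(cache, comment_id, story_comment_count, fetch_replies):
    """Return the replies to a comment, hitting the API only on a key miss.

    cache is any dict-like object standing in for memcached.
    fetch_replies is a hypothetical callable: comment_id -> list of replies.
    """
    key = replies_key(comment_id, story_comment_count)
    cached = cache.get(key)
    if cached is not None:  # an empty list is still a valid cached answer
        return cached
    replies = fetch_replies(comment_id)
    # With a real memcached client this set would carry REPLIES_TTL.
    cache[key] = replies
    return replies
```

Nothing is ever invalidated explicitly here. When the story's comment count changes, the lookup simply uses a new key, and the old entry is left to expire on its own.
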
Depends on your application, but don't think of it as working with
result sets as much as objects. I cache collections of pre-built
objects, and mash them together in my application code.
A neat benefit of doing things this way (going back to the long
answer above) is that understanding my data at this level allows me
to generate smarter etags such that the typical response sent to an
RSS reader from my app is 0 bytes (after headers).
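
One way such an etag could be derived is sketched below. The post does not say what actually goes into the etag, so hashing (story id, comment count) pairs is purely an assumption, and make_etag, respond, and render_feed are hypothetical names. The If-None-Match handling is simplified to an exact string match.

```python
import hashlib


def make_etag(versions):
    """Derive an ETag from (story_id, comment_count) pairs.

    Sorting makes the tag independent of the order the pairs arrive in.
    """
    h = hashlib.sha1()
    for story_id, count in sorted(versions):
        h.update(f"{story_id}:{count};".encode("utf-8"))
    return '"' + h.hexdigest() + '"'


def respond(if_none_match, versions, render_feed):
    """Return (status, headers, body) for a feed request.

    if_none_match is the raw If-None-Match header value, or None.
    render_feed is a hypothetical callable that builds the feed body as
    bytes; it is only called when the client's copy is out of date.
    """
    etag = make_etag(versions)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""  # nothing after the headers
    return 200, {"ETag": etag}, render_feed()
```

Because the etag is computed from data that is already cached, a reader that is up to date gets its 304 without the feed being rendered at all.
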