linuxjournal article on memcached

Brad Fitzpatrick brad@danga.com
Fri, 16 Jul 2004 12:02:58 -0700 (PDT)


On Fri, 16 Jul 2004, Perrin Harkins wrote:

> On Fri, 2004-07-16 at 14:13, Brad Fitzpatrick wrote:
> > So say the application NEEDS to do a delete, otherwise the cache would be
> > out of sync with the database.  But the memcached's network is unplugged,
> > so the delete can't happen.  Then, the client tells one of the multiple
> > bucket managers, "Yo, I couldn't fix up bucket 387, so next time you see
> > it, wipe the entire bucket, and assign that bucket to another host and
> > give me the list of who owns that bucket now."
>
> When I've worked on distributed caches in the past, we always just wiped
> any cache server that was off-line for any period of time in order to
> prevent it from serving stale data when it comes back up.  It means you
> have to rebuild some cache entries, but it's simple and safe.  However,
> it does assume you actually know about it when this happens.  If you
> frequently have transient network problems, you would need something
> like the approach you're describing here.

It doesn't matter if /we/ have transient network problems.  It matters if
/any/ person using memcached does.  (I'm sure that's what you meant, but I
wanted to clarify for others)

People shouldn't have to worry about it.  Also, the advantage of wiping
just a bucket instead of the whole cache should be obvious.  Say we need
to move a server from one switch to another: we yank the cable, move the
box, and it's back online somewhere else almost immediately.  But the
client code times a server out after 0.25 seconds or whatever, so it
loses that one update.  Maybe that happened 20 times from different
clients, so that node has to wipe 20 of its virtual buckets instead of
all 3,000 or whatever.
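
[Editor's note: a minimal sketch of the bucket-invalidation idea being
discussed, for readers following along.  The names (BucketManager,
report_failure, bucket_for) and the md5 key-to-bucket mapping are
illustrative assumptions, not memcached's actual API or code.]

```python
# Hypothetical sketch: a client that can't apply a delete reports the
# affected virtual bucket to a bucket manager, which marks the bucket
# dirty (to be wiped) and reassigns it to another host -- so only that
# one bucket is lost, not the node's entire cache.

import hashlib

NUM_BUCKETS = 3000  # virtual buckets per the message above (illustrative)

class BucketManager:
    """Tracks which host owns each virtual bucket; wipes dirty buckets."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        # Simple round-robin initial assignment of buckets to hosts.
        self.owner = {b: self.hosts[b % len(self.hosts)]
                      for b in range(NUM_BUCKETS)}
        self.dirty = set()  # buckets that must be wiped before reuse

    def report_failure(self, bucket):
        # A client couldn't deliver a delete for this bucket: mark it
        # dirty so its old contents are never served, and hand it to a
        # different host.  Returns the bucket's new owner.
        self.dirty.add(bucket)
        current = self.owner[bucket]
        others = [h for h in self.hosts if h != current]
        if others:
            self.owner[bucket] = others[bucket % len(others)]
        return self.owner[bucket]

def bucket_for(key):
    # Stable key -> virtual-bucket mapping (md5 chosen arbitrarily here).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_BUCKETS
```

A client would call something like
`mgr.report_failure(bucket_for("user:42"))` after a delete times out;
the manager wipes that one bucket and tells the client who owns it now.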

One of these days I'll shut up and code and get it done.

My partner in crime (Avva) has been doing his math thesis, though, so I've
put less work into memcached lately, since it pretty much just works as
is.

- Brad