Cache miss stampedes

Fri Jul 27 18:36:18 UTC 2007

Late to this party, but I have to mention Gearman here.

On a cache miss, instead of going to the database directly, issue a
Gearman request with a "uniq" property. The Gearman server will then
combine all the duplicate requests and dispatch only one worker.  The
worker then puts the result in the cache before returning to the Gearman
router (gearmand), and gearmand multiplexes the result back to all
waiting callers.
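
To make the coalescing idea concrete, here is a minimal in-process
sketch. It does not use Gearman or any Gearman client API; the
Coalescer class, fetch() and the "query" callback are hypothetical
names that only imitate what gearmand does with "uniq": callers that
arrive while a job for the same key is in flight share its one result.

```python
import threading
from concurrent.futures import Future


class Coalescer:
    """Hypothetical stand-in for gearmand merging jobs by "uniq" key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # uniq key -> Future shared by all waiting callers

    def submit(self, uniq, work):
        with self._lock:
            fut = self._inflight.get(uniq)
            leader = fut is None
            if leader:
                fut = Future()
                self._inflight[uniq] = fut
        if leader:
            # Only the first caller for this key actually runs the work.
            try:
                fut.set_result(work())
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                with self._lock:
                    del self._inflight[uniq]
        # Every caller, leader or not, receives the same result (or error).
        return fut.result()


def fetch(cache, coalescer, key, query):
    """Return cache[key]; on a miss, run query(key) once for all callers."""
    value = cache.get(key)
    if value is not None:
        return value

    def worker():
        # Re-check the cache: an earlier job may have filled it between
        # our miss and the moment this job started.
        value = cache.get(key)
        if value is None:
            value = query(key)
            cache[key] = value  # cache first, then return to the callers
        return value

    return coalescer.submit(key, worker)
```

With a real Gearman setup the worker would be a separate process and
the cache would be memcached; the dict here is only for illustration.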

On Wed, 25 Jul 2007, dormando wrote:

> Hey,
>
> So I'm up late adding more crap to the memcached FAQ, and I'm wondering
> about a particular access pattern:
>
> - Key A is hit very often (many times per second).
> - Key A goes missing.
> - Several dozen processes all get a cache miss on A at the same time,
> then run the SQL query/whatever, and try to set or add it back into
> memcached.
>
> Sometimes this can be destructive to a database, and can happen often if
> the expire time on the data is low for some reason.
>
> What approaches do folks typically use to deal with this more elegantly?
> The better suggestion I've heard is to try to 'add' the key (or a
> separate 'lock' key) back into memcached, and only do the query if
> you 'win' that lock. Everyone else microsleeps and retries a few times
> before running the query.
>
> Also in most of these cases you should really run a tiered cache, with
> this type of data being stored in a local cache and in memcached.
>
> This really isn't a common case, but sucks hard when it happens. In the
> back of my mind I envision a different style 'get' command, which
> defaults to a mutex operation on miss. So you'd do the special 'get',
> and on a miss you'd get a special return code that says you're
> clear to update the data (which would release the lock?). Otherwise the
> command could optionally return immediately, or hang (for a while) until
> the data's been updated.
>
> Just throwing out ideas. Thoughts?
>
> -Dormando
>
>
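
The 'add'-as-lock approach from the quoted message could look roughly
like the sketch below. FakeMemcached is a hypothetical in-memory
stand-in that only mimics the semantics of memcached's get/set/add/
delete (add succeeds only if the key is absent); the "lock:" key
prefix, get_or_compute() and its retry parameters are likewise
assumptions for illustration. With a real server the lock key should
be added with a short expire time, which this fake ignores.

```python
import threading
import time


class FakeMemcached:
    """In-memory stand-in; add() stores a key only if it is absent."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def get(self, key):
        with self._mutex:
            return self._data.get(key)

    def set(self, key, value):
        with self._mutex:
            self._data[key] = value
        return True

    def add(self, key, value):
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._mutex:
            self._data.pop(key, None)


def get_or_compute(mc, key, compute, retries=50, delay=0.01):
    value = mc.get(key)
    if value is not None:
        return value
    lock_key = "lock:" + key
    if mc.add(lock_key, 1):
        # We 'won' the lock: we are the one process that runs the query.
        try:
            value = mc.get(key)  # someone may have filled it meanwhile
            if value is None:
                value = compute()
                mc.set(key, value)
        finally:
            mc.delete(lock_key)
        return value
    # Everyone else sleeps briefly and retries the get a few times.
    for _ in range(retries):
        time.sleep(delay)
        value = mc.get(key)
        if value is not None:
            return value
    # Last resort after the retries run out: do the query ourselves.
    return compute()
```

The fallback at the end matches the "retries a few times before
running the query" behaviour, so a crashed lock holder cannot block
readers forever.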