[PATCH] Soft expiration support to prevent cache miss stampedes.

dormando dormando at rydia.net
Sun Jun 15 05:36:00 UTC 2008


I've had this discussion a few times with Domas... wasn't able to figure
out a way to make it really work right. It's better to embed soft
timeouts in your data object and use random numbers...
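
A minimal sketch of that client-side idea, under stated assumptions: the
struct cached_obj envelope, its field names, and should_refresh() are
hypothetical and are not part of memcached or any client library. The
writer stores a soft deadline next to the payload, and each reader rolls
a random number to decide whether it is the one that refreshes early.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical envelope stored as the cache value: the writer picks a
 * soft deadline that falls before the real TTL. */
struct cached_obj {
    time_t soft_expire;   /* readers may start refreshing after this */
    time_t hard_expire;   /* when the cache actually drops the key */
    /* ... application payload would follow ... */
};

/* Decide whether THIS reader should treat the object as stale.
 * draw is a uniform random number in [0, 1). The chance of refreshing
 * grows linearly from 0 at soft_expire to 1 at hard_expire. */
bool should_refresh(const struct cached_obj *o, time_t now, double draw)
{
    if (now < o->soft_expire)
        return false;
    if (now >= o->hard_expire)
        return true;
    double window = difftime(o->hard_expire, o->soft_expire);
    double elapsed = difftime(now, o->soft_expire);
    return draw < elapsed / window;
}

/* Convenience wrapper that takes the draw from rand(). */
bool should_refresh_now(const struct cached_obj *o)
{
    double draw = rand() / ((double)RAND_MAX + 1.0);
    return should_refresh(o, time(NULL), draw);
}
```

A reader that gets true would recompute the value and set it again with
fresh deadlines; everyone else keeps using the cached copy, so the
server does no extra work on get.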

-Dormando

Josef Finsel wrote:
> Olivier,
> 
> Sorry, I just realized I was only responding to you, not the list as well.
> 
> My thoughts are that this is a good patch but I'm not sure it's
> appropriate for the server. There are many ways of handling this on the
> client side. My biggest concern is that it would add extra
> processing on every get. Not a great deal, perhaps, but every
> little bit adds up over time, especially on larger sites.
> 
> If your concern is related to cache stampedes, there are many ways to
> avoid them on the client. And you can even implement code to handle
> refreshing data before it expires so that it only affects one client
> instance.
> 
> But those are just my thoughts on this.
> 
> Josef
> 
> "If you see a whole thing - it seems that it's always beautiful.
> Planets, lives... But up close a world's all dirt and rocks. And day to
> day, life's a hard job, you get tired, you lose the pattern."
> Ursula K. Le Guin
> 
> On Sat, Jun 14, 2008 at 10:37 AM, Olivier Poitrey <rs at dailymotion.com> wrote:
> 
>     If a key is very popular in the cache, when this key expires,
>     a lot of different clients will get an expired answer from the cache
>     at the same time and they will all fetch the data from the source
>     at the same time. This isn't efficient, especially when computing
>     the information is expensive and can take some time.
> 
>     This patch adds an argument to memcached that activates a soft expiration
>     period for keys with an expiration. For instance, if the soft timeout is
>     set to 30 seconds and a key has a TTL of 50 seconds, after 20 seconds
>     of life in the cache, memcached will start to return empty answers for
>     this key for a small part of the queries. The closer the key gets
>     to its real expiration time, the higher the probability
>     that memcached returns an empty answer.
> 
>     This way, if a key is very popular, a small number of clients will
>     see it as expired while all the others will see its value as normal.
>     This small number of clients will fetch the data from the source
>     and update the key value, and chances are they will finish
>     before the real key expiration time.
> 
>     NOTE: this soft timeout is a server-wide value. A future version
>     of this patch could extend the protocol to let clients
>     choose this value on a per-key basis.
>     ---
>      items.c     |    7 +++++++
>      memcached.c |   14 +++++++++++++-
>      memcached.h |    1 +
>      3 files changed, 21 insertions(+), 1 deletions(-)
> 
>     diff --git a/items.c b/items.c
>     index 88c92f6..b15325d 100644
>     --- a/items.c
>     +++ b/items.c
>     @@ -434,6 +434,13 @@ item *do_item_get_notedeleted(const char *key, const size_t nkey, bool *delete_l
>             it = NULL;
>         }
> 
>     +    if (it != NULL && settings.soft_timeout > 0 && it->exptime != 0 &&
>     +        (it->exptime - current_time) <= settings.soft_timeout) {
>     +        if (rand() % settings.soft_timeout >= (it->exptime - current_time)) {
>     +            it = NULL;
>     +        }
>     +    }
>     +
>         if (it != NULL) {
>             it->refcount++;
>             DEBUG_REFCNT(it, '+');
>     diff --git a/memcached.c b/memcached.c
>     index 27456be..6efb47d 100644
>     --- a/memcached.c
>     +++ b/memcached.c
>     @@ -177,6 +177,7 @@ static void settings_init(void) {
>         settings.managed = false;
>         settings.factor = 1.25;
>         settings.chunk_size = 48;         /* space for a modest key and value */
>     +    settings.soft_timeout = 0;        /* number of seconds before hard timeout the soft timeout will start */
>      #ifdef USE_THREADS
>         settings.num_threads = 4;
>      #else
>     @@ -1065,6 +1066,7 @@ static void process_stat(conn *c, token_t *tokens, const size_t ntokens) {
>             pos += sprintf(pos, "STAT bytes_written %llu\r\n", stats.bytes_written);
>             pos += sprintf(pos, "STAT limit_maxbytes %llu\r\n", (uint64_t) settings.maxbytes);
>             pos += sprintf(pos, "STAT threads %u\r\n", settings.num_threads);
>     +        pos += sprintf(pos, "STAT soft_timeout %d\r\n", settings.soft_timeout);
>             pos += sprintf(pos, "END");
>             STATS_UNLOCK();
>             out_string(c, temp);
>     @@ -2671,6 +2673,13 @@ static void usage(void) {
>                "-P <file>     save PID in <file>, only used with -d option\n"
>                "-f <factor>   chunk size growth factor, default 1.25\n"
>                "-n <bytes>    minimum space allocated for key+value+flags, default 48\n"
>     +           "-S <num>      Number of seconds before the hard expires to start using\n"
>     +           "              soft expiration. When an item is in soft expiration status,\n"
>     +           "              memcached will tell to SOME clients the item is expired before\n"
>     +           "              its actual expiration. The more the item is close to its actual expiration\n"
>     +           "              time, the more the probability that memcache tell a client the item is\n"
>     +           "              expired. This allow smooth cache refill when a popular key is about to\n"
>     +           "              expire.\n"
> 
>      #if defined(HAVE_GETPAGESIZES) && defined(HAVE_MEMCNTL)
>                "-L            Try to use large memory pages (if available). Increasing\n"
>     @@ -2861,7 +2870,7 @@ int main (int argc, char **argv) {
>         setbuf(stderr, NULL);
> 
>         /* process arguments */
>     -    while ((c = getopt(argc, argv, "a:bp:s:U:m:Mc:khirvdl:u:P:f:s:n:t:D:L")) != -1) {
>     +    while ((c = getopt(argc, argv, "a:bp:s:U:m:Mc:khirvdl:u:P:f:s:n:t:D:S:L")) != -1) {
>             switch (c) {
>             case 'a':
>                 /* access for unix domain socket, as octal mask (like chmod)*/
>     @@ -2952,6 +2961,9 @@ int main (int argc, char **argv) {
>                 }
>                 break;
>      #endif
>     +        case 'S':
>     +            settings.soft_timeout = atoi(optarg);
>     +            break;
>             default:
>                 fprintf(stderr, "Illegal argument \"%c\"\n", c);
>                 return 1;
>     diff --git a/memcached.h b/memcached.h
>     index ffbe880..ae11c72 100644
>     --- a/memcached.h
>     +++ b/memcached.h
>     @@ -96,6 +96,7 @@ struct settings {
>         int num_threads;        /* number of libevent threads to run */
>         char prefix_delimiter;  /* character that marks a key prefix (for stats) */
>         int detail_enabled;     /* nonzero if we're collecting detailed stats */
>     +    int soft_timeout;
>      };
> 
>      extern struct stats stats;
>     --
>     1.5.4.4
> 
> 
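
For reference, the miss probability implied by the check in the quoted
items.c hunk can be worked out with a small sketch. This is not code from
the patch: soft_miss() and soft_miss_probability() are hypothetical
helpers that mirror the patch's comparison with the random draw passed in
as a parameter. The sketch ignores the slight modulo bias of rand() and
the patch's it->exptime != 0 guard, and remaining stands for
it->exptime - current_time.

```c
#include <stdbool.h>

/* Same comparison as the patch, with the random draw supplied by the
 * caller so the outcome can be checked deterministically. */
bool soft_miss(unsigned draw, unsigned soft_timeout, unsigned remaining)
{
    if (soft_timeout == 0 || remaining > soft_timeout)
        return false;               /* soft expiration not active yet */
    return draw % soft_timeout >= remaining;
}

/* Fraction of the soft_timeout possible remainders that give a miss.
 * Inside the soft window this works out to
 * (soft_timeout - remaining) / soft_timeout. */
double soft_miss_probability(unsigned soft_timeout, unsigned remaining)
{
    unsigned misses = 0;

    if (soft_timeout == 0)
        return 0.0;
    for (unsigned d = 0; d < soft_timeout; d++) {
        if (soft_miss(d, soft_timeout, remaining))
            misses++;
    }
    return (double)misses / soft_timeout;
}
```

With the numbers from the description (soft timeout 30, TTL 50), a get at
age 20 has 30 seconds remaining and never misses, a get at age 35 has 15
seconds remaining and misses about half the time, and a get at age 49
misses 29 times out of 30.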


