[PATCH] Soft expiration support to prevent cache miss stampedes.

Olivier Poitrey rs at dailymotion.com
Sat Jun 14 19:32:24 UTC 2008


Josef,

As I also answered off-list, I totally agree: there are other
solutions to this issue. This one has pros and cons. One of the cons
is that it adds a very small piece of code on the server side that is
executed on every get. On the other hand, the main advantage of this
technique is that it doesn't require any change on the client side:
you activate it and the problem is immediately solved. It could
interest some people, which is why I shared this patch with you guys.

Cheers,

On 14 June 08, at 19:18, Josef Finsel wrote:

> Olivier,
>
> Sorry, I just realized I was only responding to you, not the list as  
> well.
>
> My thoughts are that this is a good patch but I'm not sure it's  
> appropriate for the server. There are many ways of handling this on  
> the client side. The biggest concern I have with it is that it would  
> add additional processing on every get. Not a great deal, perhaps  
> but every little bit adds up over time, especially on larger sites.
>
> If your concern is related to cache stampedes, there are many ways  
> to avoid them on the client. And you can even implement code to  
> handle refreshing data before it expires so that it only affects one  
> client instance.
>
> But that's just my thoughts on this.
>
> Josef
>
> "If you see a whole thing - it seems that it's always beautiful.  
> Planets, lives... But up close a world's all dirt and rocks. And day  
> to day, life's a hard job, you get tired, you lose the pattern."
> Ursula K. Le Guin
>
> On Sat, Jun 14, 2008 at 10:37 AM, Olivier Poitrey  
> <rs at dailymotion.com> wrote:
> If a key is very popular in the cache, then when it expires, a lot of
> different clients get a miss from the cache at the same time and all
> fetch the data from the source at once. This isn't efficient,
> especially when computing the information is expensive and can take
> some time.
>
> This patch adds an argument to memcached that activates a soft
> expiration period for keys with an expiration. For instance, if the
> soft timeout is set to 30 seconds and a key has a TTL of 50 seconds,
> then after 20 seconds of life in the cache, memcached will start to
> return empty answers for this key for a small part of the queries.
> The closer the key is to its real expiration time, the higher the
> probability that memcached returns an empty answer.
>
> This way, if a key is very popular, a small number of clients will
> see it as expired while all the others will see its value as normal.
> This small number of clients will fetch the data from the source
> and update the key's value, and chances are they will finish
> before the real key expiration time.
>
> NOTE: this soft timeout is a server-wide value; a future version
> of this patch could extend the protocol to let clients
> choose this value on a per-key basis.
> ---
>  items.c     |    7 +++++++
>  memcached.c |   14 +++++++++++++-
>  memcached.h |    1 +
>  3 files changed, 21 insertions(+), 1 deletions(-)
>
> diff --git a/items.c b/items.c
> index 88c92f6..b15325d 100644
> --- a/items.c
> +++ b/items.c
> @@ -434,6 +434,13 @@ item *do_item_get_notedeleted(const char *key, const size_t nkey, bool *delete_l
>         it = NULL;
>     }
>
> +    if (it != NULL && settings.soft_timeout > 0 && it->exptime != 0 &&
> +        (it->exptime - current_time) <= settings.soft_timeout) {
> +        if (rand() % settings.soft_timeout >= (it->exptime - current_time)) {
> +            it = NULL;
> +        }
> +    }
> +
>     if (it != NULL) {
>         it->refcount++;
>         DEBUG_REFCNT(it, '+');
> diff --git a/memcached.c b/memcached.c
> index 27456be..6efb47d 100644
> --- a/memcached.c
> +++ b/memcached.c
> @@ -177,6 +177,7 @@ static void settings_init(void) {
>     settings.managed = false;
>     settings.factor = 1.25;
>     settings.chunk_size = 48;         /* space for a modest key and value */
> +    settings.soft_timeout = 0;        /* number of seconds before hard timeout the soft timeout will start */
>  #ifdef USE_THREADS
>     settings.num_threads = 4;
>  #else
> @@ -1065,6 +1066,7 @@ static void process_stat(conn *c, token_t *tokens, const size_t ntokens) {
>         pos += sprintf(pos, "STAT bytes_written %llu\r\n", stats.bytes_written);
>         pos += sprintf(pos, "STAT limit_maxbytes %llu\r\n", (uint64_t) settings.maxbytes);
>         pos += sprintf(pos, "STAT threads %u\r\n", settings.num_threads);
> +        pos += sprintf(pos, "STAT soft_timeout %d\r\n", settings.soft_timeout);
>         pos += sprintf(pos, "END");
>         STATS_UNLOCK();
>         out_string(c, temp);
> @@ -2671,6 +2673,13 @@ static void usage(void) {
>            "-P <file>     save PID in <file>, only used with -d option\n"
>            "-f <factor>   chunk size growth factor, default 1.25\n"
>            "-n <bytes>    minimum space allocated for key+value+flags, default 48\n"
> +           "-S <num>      Number of seconds before the hard expires to start using\n"
> +           "              soft expiration. When an item is in soft expiration status,\n"
> +           "              memcached will tell to SOME clients the item is expired before\n"
> +           "              its actual expiration. The more the item is close to its actual expiration\n"
> +           "              time, the more the probability that memcache tell a client the item is\n"
> +           "              expired. This allow smooth cache refill when a popular key is about to\n"
> +           "              expire.\n"
>
>  #if defined(HAVE_GETPAGESIZES) && defined(HAVE_MEMCNTL)
>            "-L            Try to use large memory pages (if available). Increasing\n"
> @@ -2861,7 +2870,7 @@ int main (int argc, char **argv) {
>     setbuf(stderr, NULL);
>
>     /* process arguments */
> -    while ((c = getopt(argc, argv, "a:bp:s:U:m:Mc:khirvdl:u:P:f:s:n:t:D:L")) != -1) {
> +    while ((c = getopt(argc, argv, "a:bp:s:U:m:Mc:khirvdl:u:P:f:s:n:t:D:S:L")) != -1) {
>         switch (c) {
>         case 'a':
>             /* access for unix domain socket, as octal mask (like chmod)*/
> @@ -2952,6 +2961,9 @@ int main (int argc, char **argv) {
>             }
>             break;
>  #endif
> +        case 'S':
> +            settings.soft_timeout = atoi(optarg);
> +            break;
>         default:
>             fprintf(stderr, "Illegal argument \"%c\"\n", c);
>             return 1;
> diff --git a/memcached.h b/memcached.h
> index ffbe880..ae11c72 100644
> --- a/memcached.h
> +++ b/memcached.h
> @@ -96,6 +96,7 @@ struct settings {
>     int num_threads;        /* number of libevent threads to run */
>     char prefix_delimiter;  /* character that marks a key prefix (for stats) */
>     int detail_enabled;     /* nonzero if we're collecting detailed stats */
> +    int soft_timeout;
>  };
>
>  extern struct stats stats;
> --
> 1.5.4.4
>
>
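
For illustration only, the check added to items.c above can be isolated
into a small standalone sketch. The names soft_expired and miss_count
are hypothetical and not part of the patch, and the random value is
passed in as a parameter (where the patch calls rand()) so the behaviour
can be inspected deterministically. The sketch also leaves out the
patch's it->exptime != 0 test, which skips items that never expire.
Assuming rand() is roughly uniform, the chance of a forced miss with r
seconds left and a soft timeout of S is about (S - r) / S.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a lookup should be reported as a miss.
 *   remaining    : seconds until hard expiration (exptime - current_time)
 *   soft_timeout : length of the soft window in seconds (the -S value)
 *   roll         : a non-negative random number, e.g. the result of rand()
 */
bool soft_expired(int remaining, int soft_timeout, int roll) {
    assert(roll >= 0);
    if (soft_timeout <= 0 || remaining > soft_timeout) {
        return false;   /* feature disabled, or not inside the soft window yet */
    }
    return roll % soft_timeout >= remaining;
}

/* Count how many of the soft_timeout possible residues lead to a miss. */
int miss_count(int remaining, int soft_timeout) {
    int misses = 0;
    for (int roll = 0; roll < soft_timeout; roll++) {
        if (soft_expired(remaining, soft_timeout, roll)) {
            misses++;
        }
    }
    return misses;
}
```

With the example from the description (TTL of 50 seconds, soft timeout
of 30), miss_count(30, 30) is 0 when the window opens at 20 seconds of
age, miss_count(20, 30) is 10 (about one get in three) ten seconds
later, and miss_count(0, 30) is 30, so every get is a miss at the hard
expiration time.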

-- 
Olivier Poitrey
Co-Founder & CTO
Dailymotion SA





