Listing Keys in Cache ??

Mon Sep 10 14:09:38 UTC 2007

Todd Fisher wrote:
> Hi all,
> 
> I've been working on implementing an ESI (Edge Side Include) proxy caching
> server. The idea is very similar to SSI (Server Side Include), except the
> includes are remote instead of local.
> 
> There's a very nice write up of using SSI in this context with nginx and
> memcached here =>
> http://blog.kovyrin.net/2007/08/05/using-nginx-ssi-and-memcache-to-make-your-web-applications-faster/

Cool idea:  I was the lead for the project which funded adding ESI to
squid3, and I'm in favor of multiple implementations.

> My project is mongrel-esi (http://code.google.com/p/mongrel-esi) and I'd
> like to use it as an example of how one can use memcache and still
> invalidate URLs using regular expressions.
> 
> The basic idea is to store in memcache the regular expressions picked up
> while processing invalidation rules. Before using a key retrieved from
> memcache, we check whether the key matches any of the regular expressions
> stored under the invalidation-key and whether the cached object is older
> than the invalidation rule that matched. We can probably be clever about
> how we name the invalidation-key so that it too can expire, so long as the
> invalidation-key doesn't expire before a predetermined max-expiration time
> that all keys get. Then we can be certain that, either when a key is next
> requested or when the key's own expiration time arrives, the objects marked
> to expire by the regex will expire as requested by surrogate.
> 
> For me this realization is great because it means I can finish implementing
> the proxy server using memcached as the cache storage.   Hopefully, it also
> will be a helpful solution for others looking to access all the keys.   And
> of course, I may be missing something important, so please correct me if I
> have.
> 
> With this solution I think the important part is making sure the
> invalidation rules carry an expire time that is greater than or equal to
> the expire time of all keys. There may also be ways to name the
> invalidation keys per domain, so that there is more than one invalidation
> key and no single one grows into a large dataset that needs to be
> retrieved every time. It also means two hits on memcache per key lookup
> instead of one. There are probably other techniques for solving this
> problem. Just hoping it makes sense and is useful to others.
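
To make the scheme quoted above concrete, here is a rough Python sketch of
how I read it. It is not taken from mongrel-esi. DictCache is an in-memory
stand-in for a memcached client, and the names RegexInvalidatingCache,
RULES_KEY and MAX_TTL are made up for illustration. The assumption is that
every object is stored with a TTL of at most MAX_TTL and that the rule list
is always written with a TTL of MAX_TTL, so a rule cannot disappear before
the objects it covers.

```python
import re
import time

MAX_TTL = 3600  # assumed cap, in seconds, on every cached object's lifetime


class DictCache:
    """In-memory stand-in for a memcached client: get/set with a TTL."""

    def __init__(self, clock=time.time):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)  # treat expired entries as missing
            return None
        return entry[0]


class RegexInvalidatingCache:
    RULES_KEY = "invalidation-rules"  # hypothetical name for the invalidation-key

    def __init__(self, cache, clock=time.time):
        self._cache = cache
        self._clock = clock

    def store(self, url, body, ttl):
        # Remember when the object was stored so rules can be compared to it.
        self._cache.set(url, (body, self._clock()), min(ttl, MAX_TTL))

    def invalidate(self, pattern):
        now = self._clock()
        rules = self._cache.get(self.RULES_KEY) or []
        # A rule older than MAX_TTL can only match objects that have expired.
        rules = [(p, t) for p, t in rules if now - t < MAX_TTL]
        rules.append((pattern, now))
        self._cache.set(self.RULES_KEY, rules, MAX_TTL)

    def fetch(self, url):
        entry = self._cache.get(url)  # first cache hit
        if entry is None:
            return None
        body, stored_at = entry
        rules = self._cache.get(self.RULES_KEY) or []  # second cache hit
        for pattern, issued_at in rules:
            # Stale only if stored no later than a rule that matches the URL.
            if stored_at <= issued_at and re.search(pattern, url):
                return None
        return body
```

Usage would look like cache.store("/news/1", body, 600), then
cache.invalidate(r"^/news/"), after which cache.fetch("/news/1") returns
None until the fragment is stored again. Note that invalidate() does a
read-modify-write on the rule list, which is not atomic; against a real
memcached something like its cas command would be needed to avoid losing a
rule when two invalidations race.
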

I'm puzzled:  why don't you store the fragments into memcache using
expiration times which correspond to the HTTP cache headers which
wrapped them?  One of the reasons I was excited about ESI was that I
could cache different sections of the page with different policies,
which meant I could drop the whole idea of purging pages from the Squid
cache:  the "hot" sections of the page would just time out naturally.
Most pages themselves would have *long* expiration times in this
scenario (e.g., 1 day); a given zone on the page (e.g., "top 5 news
stories") might have a very short expiration (e.g., 30 seconds).
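
A minimal Python sketch of what I mean, assuming the fragment's lifetime is
taken from its Cache-Control response header. The function name
ttl_from_cache_control is made up, and the policy (prefer s-maxage over
max-age, do not cache when neither is present or when the response is
marked no-store, no-cache or private) is one reasonable choice for a shared
cache, not the only one.

```python
def ttl_from_cache_control(header_value):
    """Return a cache lifetime in seconds derived from a Cache-Control value.

    Returns 0 (do not cache) when the header forbids shared caching or
    carries no usable s-maxage / max-age directive.
    """
    directives = {}
    for directive in header_value.split(","):
        name, _, arg = directive.strip().partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')

    if any(name in directives for name in ("no-store", "no-cache", "private")):
        return 0

    # s-maxage is aimed at shared caches, so it wins over max-age.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdecimal():
            return int(value)
    return 0
```

The page template served with "max-age=86400" would then be stored for a
day, while the "top 5 news stories" fragment served with "max-age=30" would
be stored for 30 seconds, each with a plain set(key, body, ttl). One thing
to watch with memcached: expiration values above 30 days are interpreted as
absolute Unix timestamps rather than relative seconds.
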

Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com