tag proposal

Thu Oct 4 16:11:06 UTC 2007

Tobias Lütke wrote:
> This also means that the number of tags in the system will be quite
> large. There will be one or more tags for each row in the articles
> table. I expect the number of tags to be vastly larger than the number
> of keys in future memcached servers.
>   

Which is why I'm kind of skeptical about the whole tags thing, honestly. 
It seems like an optimization for the rare case (invalidation) at the 
expense of the vastly more common case (getting values by ID): storing 
tags reduces the amount of memory available for keys and values, and 
fewer items in the cache means a lower hit rate.

Obviously different applications have different usage. I can tell you 
that in our application, gets outnumber deletes by at least two orders 
of magnitude across the board, and many of our objects are so small that 
any tag would likely eat more memory than the value being cached. 
(Perhaps not more than the object header, but certainly more than the 
value.)
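
To put rough numbers on the memory argument, here is a minimal sketch in 
C. Every size in it is a made-up illustration (the struct, the function 
names and the byte counts are all hypothetical, not measured from 
memcached), and it ignores slab classes and alignment entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-item sizes, for illustration only. */
struct item_cost {
    size_t header_bytes; /* per-item bookkeeping */
    size_t key_bytes;
    size_t value_bytes;
    size_t tag_bytes;    /* 0 when the item carries no tag */
};

/* Total bytes one cached item consumes under this simple model. */
size_t bytes_per_item(struct item_cost c) {
    return c.header_bytes + c.key_bytes + c.value_bytes + c.tag_bytes;
}

/* How many such items fit in a cache of the given size. */
size_t items_that_fit(size_t cache_bytes, struct item_cost c) {
    size_t per_item = bytes_per_item(c);
    assert(per_item > 0);
    return cache_bytes / per_item;
}
```

With an assumed 48-byte header, 16-byte key and 8-byte value, a 64 MB 
cache holds 932067 items under this model. Add a 16-byte tag to each and 
it holds 762600, roughly 18% fewer, even though the tag is only twice 
the size of the value.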

Also, invalidating a tag means broadcasting a "delete by tag" request to 
all the memcached servers, since you have no way of knowing which servers 
have objects with which tags. For large sites with lots of memcached 
servers, or even medium-sized sites using the "run a memcached instance 
on each web host" approach, that means a ton of outgoing requests, 
almost all of which are unlikely to invalidate anything at all if the 
tags are relatively sparse.
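
The fan-out difference can be sketched as follows. This is a toy model in 
C, not client code: the function names are hypothetical, and the djb2 
hash stands in for whatever key-to-server mapping a real client uses.

```c
#include <assert.h>
#include <stddef.h>

/* Toy key-to-server mapping (djb2 hash modulo server count).
 * Real clients use their own hashing schemes. */
size_t server_for_key(const char *key, size_t n_servers) {
    assert(n_servers > 0);
    unsigned long h = 5381;
    for (const unsigned char *p = (const unsigned char *)key; *p; p++)
        h = h * 33 + *p;
    return (size_t)(h % n_servers);
}

/* A delete by key goes to exactly one server: the one the key maps to. */
size_t requests_for_key_delete(void) {
    return 1;
}

/* A delete by tag goes to every server, because the client cannot
 * tell which servers hold items carrying that tag. */
size_t requests_for_tag_delete(size_t n_servers) {
    return n_servers;
}

/* Requests that land on a server holding nothing with that tag. */
size_t wasted_tag_requests(size_t n_servers, size_t servers_with_tag) {
    assert(servers_with_tag <= n_servers);
    return n_servers - servers_with_tag;
}
```

For example, with 50 servers and a tag whose items live on only 2 of 
them, one invalidation costs 50 requests and 48 of those do nothing, 
where a delete by key would have cost a single request.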

Not saying the feature isn't worth adding; there are doubtless valid use 
cases for it. But whatever implementation finally arrives, IMO, 
shouldn't impose any per-object memory overhead on objects that have no 
tags at all. Or if it does, it should be surrounded by #ifdef so that 
sites that don't need it don't see their available cache memory drop 
substantially when they upgrade.
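
Something along these lines is what I have in mind for the #ifdef 
option. It is only a sketch: the struct, its fields and the ENABLE_TAGS 
flag are hypothetical names, not memcached's actual item header or build 
options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item header. Build with -DENABLE_TAGS to compile
 * tag support in; without it the tag field does not exist at all. */
struct cache_item {
    struct cache_item *next;
    uint32_t exptime;
    uint32_t nbytes;
#ifdef ENABLE_TAGS
    uint32_t tag_id; /* per-item cost paid only by tag-enabled builds */
#endif
};

#ifndef ENABLE_TAGS
/* Untagged builds must pay nothing: the header is exactly its base fields. */
static_assert(sizeof(struct cache_item) == sizeof(void *) + 2 * sizeof(uint32_t),
              "untagged header must not grow");
#endif

/* Size of the per-item header in this build. */
size_t item_header_size(void) {
    return sizeof(struct cache_item);
}

/* Reports whether tag support was compiled in. */
int tags_compiled_in(void) {
#ifdef ENABLE_TAGS
    return 1;
#else
    return 0;
#endif
}
```

Sites that never use tags build without the flag and keep the same 
per-item header they have today. Only builds that opt in carry the extra 
field (plus whatever padding the compiler adds after it).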

-Steve