tag proposal

Dustin Sallings dustin at spy.net
Thu Oct 4 18:37:59 UTC 2007


On Oct 4, 2007, at 9:11 , Steven Grimm wrote:

> Tobias Lütke wrote:
>> This also means that the number of tags in the system will be quite
>> large. There will be one or more tags for each row in the articles
>> table. I expect the number of tags to be vastly larger than the
>> number of keys in future memcached servers.
>
> Which is why I'm kind of skeptical about the whole tags thing,  
> honestly. It seems like an optimization for the rare case  
> (invalidation) at the expense of the vastly more common case  
> (getting values by ID) by virtue of reducing the amount of memory  
> available for keys and values. Fewer items in the cache equals  
> lower hit rate.

	There are only a large number of tags if you create a large number  
of tags.

> Obviously different applications have different usage. I can tell  
> you that in our application, gets outnumber deletes by at least two  
> orders of magnitude across the board, and many of our objects are  
> so small that any tag would likely eat more memory than the value  
> being cached. (Not, perhaps, than the object header, but certainly  
> more than the value.)

	I would hope that it'd generally be the case that deletes aren't  
common.  I'm hoping that tags aren't going to encourage people to  
delete *more*, but to delete more accurately.

> Also, invalidating a tag means broadcasting a "delete by tag"  
> request to all the memcached servers since you have no way of  
> knowing which servers have objects with which tags. For large sites  
> with lots of memcached servers, or even medium-sized sites using  
> the "run a memcached instance on each web host" approach, that  
> means a ton of outgoing requests, almost all of which are likely to  
> not invalidate anything at all if the tags are relatively sparse.

	It's a lot of requests, but only rarely.  Broadcast isn't particularly  
expensive in my client, but I can certainly see how it is for others.

	It comes down to measurements, I suppose.  If tags help, then it'll  
be useful.
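
The broadcast cost discussed above can be illustrated with a small simulation. This is only a sketch under assumptions: FakeServer, server_for and broadcast_delete_by_tag are hypothetical names made up for illustration, not part of memcached or of any real client, and the per-server tag index is one possible design rather than the proposed one. It shows that a delete-by-tag has to be sent to every server, even though only the few servers holding tagged items actually remove anything.

```python
import zlib
from collections import defaultdict


class FakeServer:
    """Toy stand-in for one memcached instance (hypothetical, not a real API)."""

    def __init__(self):
        self.items = {}                    # key -> (value, frozenset of tags)
        self.tag_index = defaultdict(set)  # tag -> keys currently carrying it

    def _drop(self, key):
        """Remove one key and its tag references; return True if it existed."""
        entry = self.items.pop(key, None)
        if entry is None:
            return False
        for tag in entry[1]:
            self.tag_index[tag].discard(key)
        return True

    def set(self, key, value, tags=()):
        tags = frozenset(tags)
        self._drop(key)  # forget tags from any previous value of this key
        self.items[key] = (value, tags)
        for tag in tags:
            self.tag_index[tag].add(key)

    def delete_by_tag(self, tag):
        """Drop every item carrying `tag`; return how many were removed."""
        keys = list(self.tag_index.get(tag, ()))
        return sum(self._drop(key) for key in keys)


def server_for(key, servers):
    """Pick a server by hashing the key, as a client would for get/set."""
    return servers[zlib.crc32(key.encode()) % len(servers)]


def broadcast_delete_by_tag(servers, tag):
    """Tags are not hashed to a server, so every server must be asked.

    Returns (requests_sent, servers_that_removed_something, items_removed).
    """
    removed = [server.delete_by_tag(tag) for server in servers]
    return len(servers), sum(1 for n in removed if n), sum(removed)


if __name__ == "__main__":
    servers = [FakeServer() for _ in range(50)]
    for i in range(3):
        key = f"article:{i}"
        server_for(key, servers).set(key, "body", tags=["author:42"])
    print(broadcast_delete_by_tag(servers, "author:42"))
```

With 50 servers and three tagged items, 50 requests go out and at most three of them invalidate anything, which is the sparse-tag case Steven describes. Whether that matters depends on how often invalidations happen, which is the measurement question.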

> Not saying the feature isn't worth adding; there are doubtless  
> valid use cases for it. But whatever implementation finally  
> arrives, IMO, shouldn't impose any per-object memory overhead on  
> objects that have no tags at all. Or if it does, it should be  
> surrounded by #ifdef so that sites that don't need it don't see  
> their available cache memory drop substantially when they upgrade.

	I was imagining the overhead being something like 8 bytes per item  
on a 32-bit system as well as the tag hash table.
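
A back-of-the-envelope sketch of what a fixed per-item cost means for cache capacity. The 8-byte figure is the one from the message; everything else here is an assumption for illustration: the function names are hypothetical, the item sizes are made up, and the tag hash table itself is not counted.

```python
def tag_overhead(num_items, avg_item_bytes, per_item_overhead=8):
    """Return (extra_bytes, fraction of per-item memory spent on overhead).

    avg_item_bytes is the assumed size of an item (header + key + value)
    before any tag support is added.
    """
    extra = num_items * per_item_overhead
    fraction = per_item_overhead / (avg_item_bytes + per_item_overhead)
    return extra, fraction


def items_that_fit(cache_bytes, avg_item_bytes, per_item_overhead=0):
    """How many equally sized items fit in a cache of cache_bytes."""
    return cache_bytes // (avg_item_bytes + per_item_overhead)


if __name__ == "__main__":
    cache = 1 << 30  # 1 GiB, an arbitrary example size
    for size in (100, 1000):  # hypothetical small and larger items
        without = items_that_fit(cache, size)
        with_tags = items_that_fit(cache, size, per_item_overhead=8)
        print(size, without, with_tags, without - with_tags)
```

The relative cost shrinks as items get larger, so the overhead matters most for the very small objects Steven mentions.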

-- 
Dustin Sallings



