tag proposal

Clint Webb webb.clint at gmail.com
Thu Oct 4 06:48:28 UTC 2007


Sounds good.

I personally do not see the need to use tags for anything other than
easily removing (invalidating?) keys that have a particular tag.  Being able
to do other things, like retrieving all keys that have a tag, could be
useful, but I can see it complicating things.

First point.  When you say 'invalidate_tag' I assume you mean to invalidate
all keys that have that tag.

Second point.  You ask about refcounts on tags.  I think tags should be
refcounted, or at the very least there should be a command that tells the
cache to remove all tags that are no longer referenced by a key.
Refcounting is probably easier: decrement the count when a key is deleted
or expires, and remove the tag when the count reaches zero.  Otherwise,
memory used by tags will keep growing as keys drop out due to LRU.  An
example: article 50 is added to the cache, and tags representing that
article ID are added to a bunch of keys.  Over time, article 50 no longer
has much visibility and its keys fall out of the cache due to LRU.  Once no
keys in the cache carry the tag, I think the tag should go.
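
A rough sketch of the refcount idea, in Python rather than memcached's C
just to keep it short.  TagTable, add_tag and drop_key are names made up
for this illustration, not anything in memcached, and the sketch assumes
the cache calls drop_key whenever a key is deleted, expires, or is evicted
by LRU.

```python
class TagTable:
    """Hypothetical refcounted tag registry (all names are illustrative)."""

    def __init__(self):
        self.refcount = {}      # tag name -> number of keys carrying it
        self.tags_by_key = {}   # key -> set of tag names on that key

    def add_tag(self, key, tag):
        tags = self.tags_by_key.setdefault(key, set())
        if tag in tags:
            return  # key already carries this tag: do not count it twice
        tags.add(tag)
        self.refcount[tag] = self.refcount.get(tag, 0) + 1

    def drop_key(self, key):
        """Assumed hook: called on delete, expiry, or LRU eviction."""
        for tag in self.tags_by_key.pop(key, set()):
            self.refcount[tag] -= 1
            if self.refcount[tag] == 0:
                del self.refcount[tag]  # last reference gone: free the tag


# The article 50 example: two keys share one tag, then both fall out.
table = TagTable()
table.add_tag("article:50:body", "article-50")
table.add_tag("article:50:comments", "article-50")
table.drop_key("article:50:body")
table.drop_key("article:50:comments")
```

After the second drop_key the tag has no references left and its entry is
removed, so tag memory shrinks along with the keys.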

Third point.  I am also assuming that we can assign more than one tag to a
key.  If I could only add one tag to a key, that would limit its
usefulness to me.  You mention pointers (plural, not just a pointer), so I
am sure that this is correct; I am just clarifying.

On 10/4/07, Dustin Sallings <dustin at spy.net> wrote:
>
>
> Tags seem to be getting hot, and lots of people have talked about them
> fairly abstractly.  I wanted to try to bring some of those ideas together
> with respect to a memcached implementation and sit back and watch it all
> happen.  :)
> Firstly, I think there are two new commands to implement tags:
>
> 1)  add_tag (key, tag_name)
> 2)  invalidate_tag (tag_name)
>
> I don't think there's a need for tag inspection for a given object.  There
> is *definitely* no command to search by tag.
>
>
> All of the actual tags (text) would exist in a global hash table whose
> value is a generation number.
>
> [It's unclear whether it's worth the effort to ever release a tag once
> it's been added.  If we assume that tags live forever, we don't have to
> refcount them and a few things get easier.  Any opinions?]
>
>
> A single global generation number is used to track invalidation events.
>
> Each cache item contains a space for pointers to tags with their
> individual generation numbers and a local generation number.
>
> When a tag is added to an item, the global generation number is copied
> into the item's local generation number (if it's not set), and the tag space
> is extended to point to the tag key at its current individual generation.
>
> Adding an existing tag to an item must not cause any modification to the
> item (i.e. check first).
>
>
> Invalidation of a tag would basically be a ``global_generation =
> ++tags[tag]'' kind of operation.
>
> Each time an item is requested from a cache, the local generation number
> is compared against the global generation number.  If it differs, each tag
> is checked to ensure the tag generation number equals the number stored for
> that tag.
>
> If they're all the same, the local generation number is set to the global
> generation number.
>
> If they're different, this record doesn't exist.
>
>
>
> I've secretly left a lot of holes in this concept as a puzzle to the
> reader.  Three units of cool to each person who finds one.
>
> --
> Dustin Sallings
>
>
>
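
To make sure I am reading the generation scheme above correctly, here is a
minimal sketch of it in Python.  TaggedCache and its field names are
invented for this illustration.  One deliberate assumption: instead of the
literal ``global_generation = ++tags[tag]'', the sketch bumps a single
counter and stores it in the tag, because read literally two different
tags could be incremented to the same value and leave the global number
unchanged.  Tags are also assumed to live forever here (no refcounting).

```python
class TaggedCache:
    """Hypothetical model of the quoted scheme; not memcached code."""

    def __init__(self):
        self.global_generation = 0
        self.tags = {}    # tag name -> generation of its last invalidation
        self.items = {}   # key -> {"value", "tags", "local_gen"}

    def set(self, key, value):
        self.items[key] = {"value": value, "tags": {}, "local_gen": None}

    def add_tag(self, key, tag):
        item = self.items[key]
        if tag in item["tags"]:
            return  # adding an existing tag must not modify the item
        if item["local_gen"] is None:
            item["local_gen"] = self.global_generation
        # Remember the tag's generation at the time it was attached.
        item["tags"][tag] = self.tags.setdefault(tag, 0)

    def invalidate_tag(self, tag):
        # Assumption: one monotonically increasing counter (see lead-in).
        self.global_generation += 1
        self.tags[tag] = self.global_generation

    def get(self, key):
        item = self.items.get(key)
        if item is None:
            return None
        local = item["local_gen"]
        if local is not None and local != self.global_generation:
            # Something was invalidated since we last checked this item.
            for tag, seen in item["tags"].items():
                if self.tags.get(tag) != seen:
                    del self.items[key]  # stale: this record doesn't exist
                    return None
            item["local_gen"] = self.global_generation
        return item["value"]


cache = TaggedCache()
cache.set("front-page", "page body")
cache.set("sidebar", "sidebar body")
cache.add_tag("front-page", "article-50")
cache.add_tag("sidebar", "article-51")
cache.invalidate_tag("article-50")
```

With this reading, invalidate_tag touches only the tag table and the
counter; stale items are discovered lazily on the next get, and items
whose tags were not touched just refresh their local generation.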


-- 
"Be excellent to each other"