Protcol Additions

Andy memcached at thwartedefforts.org
Thu Nov 17 12:01:35 PST 2005


On Thu, 2005-11-17 at 18:28 +0100, Marcel Holan wrote:

> I see no problem in having this feature present with a O(n) runtime
> complexity, as all alternative and viable ways @ clientside have O(n) also,
> but add much more cruft to the code.

While this may be true, O(n) complexity in the server is different than
O(n) complexity in the client.  Since memcached is not threaded, then
performing an O(n) operation on the server makes that server unusable
for the duration of performing the operation.  Doing the O(n) operation
from the client (using multiple O(1) queries in the server) does not
keep that memcached server from fielding other requests while the
operation is being performed.

Now, admittedly, actually performing the O(n) operation on the client
would be a lot easier if the client had a way to get a list of all the
keys, but that, unfortunately, would also be an O(n) operation that gets
performed on the server (and would be an O(n) operation on the client
for n memcached servers).

What if we think about the problem differently.  The original problem:

        We write affiliate management software and are currently adding
        support for memcached throughout the system. Our system is
        extremely configurable allowing the admin to set different
        commission structures based on the site, option of the site and
        affiliate. I cache these payout options in memcached to not have
        to query from the database constantly since I need them very
        often. I use a key like this: payout_SITEID_OPTIONID_AFFILIATEID
        
        Now, if I had 5 sites, 2 options per site and 1000 affiliates
        that would be 18018 different keys right there (they could all
        be 0 also to use the default).
        
        In the worst case, if an admin changes the base payout I would
        have to go through all possibilities to delete the key from
        memcached, I do not want to let them expire because the changing
        of payouts does not happen often so it makes more sense to
        delete the keys when they are changed.

What about the following method?

        Store the "last update time" as a key in memcached, and before
        you get the payout_$SITEID_$OPTIONID_$AFFILIATEID values,
        retrieve that value and use it to build an actual keyname that
        the payout is stored under, like payout_$SITEID_$OPTIONID_
        $AFFILIATEID_$LASTUPDATE.  Set the expire time of the payout
        keys based on, perhaps, some average of the frequency of updates
        of the payout values.
        
        When updating payout values in the database, change the value of
        the "last update time" key in memcached.  All the old values
        will then not be referenced any more and will eventually get
        cycled out.

But I think this might be too complicated (and I have not actually used
this method, but I think there's a problem with I can not quite put my
finger on), and I think there a simpler method...

Since the act of updating the payouts is relatively rare, then that is
when the data in memcached should be updated.  And you already know what
has changed, so you know the keyname to update/replace with the new
value.

In fact, the actual number of payout keys is finite and has an
upperbound of (sites * options * affiliates).  If you are changing the
baserate for everything, perform (sites * options * affiliates) deletes
(which will be a noop if the key doesn't exist) or recalculate and
update them.  This can be optimized if you know exactly what you are
changing; if one affiliate's payout changes, then you only have to
perform (sites * options) updates/deletes.  You can just iterate over
all the sites, options and affiliates when you need to.  This can happen
asynchronously or in another thread. 

No matter what you do, a lot of keys will need to be iterated.  Having a
wildcard delete command or a list all keys command makes the memcached
server do the iteration over all keys.  I think this is the wrong place
to do it, considering what the intent of memcached is.  You can have the
client do the iteration to update/replace/delete at update time because
you know all the keys that should exist (presumably sites, options and
affiliates are fully listable from some kind of backing-store/database).

-- 
Andy <memcached at thwartedefforts.org>



More information about the memcached mailing list