[Repost] Key Dependencies
clayton collie
gboyah at hotmail.com
Mon Mar 12 23:02:46 UTC 2007
My apologies for the bungled post (thanks to hotmail)
Anyhow...
I need the following in an ongoing project, so i'm posting here to see if
others see merit in the idea.
Proposal
--------
Add key-based dependencies to memcached.
Motivation
----------
To simplify common caching scenarios (data dependencies and namespaces)
by centralizing the logic at the server.
Example
---------
Assume we have an organization selling widgets to the Americas
and Asia, and that the sales calculations are aggregated and
cached.
add ns:americas 0 0 nothing\r\n
add ns:asia 0 0 nothing\r\n
add americas:sales 0 0 25000\r\n
add americas:sales.usa 0 0 20000\r\n
add americas:sales.usa.bolts 0 0 10000\r\n
add americas:sales.usa.grommets 0 0 10000\r\n
add americas:sales.colombia 0 0 5000\r\n
add americas:sales.colombia.bolts 0 0 5000\r\n
dependency americas:sales.usa americas:sales.usa.bolts
americas:sales.usa.grommets\r\n
dependency americas:sales.combia americas:sales.colombia.bolts\r\n
dependency americas:sales americas:sales.usa americas:sales.colombia\r\n
dependency americas:sales.usa ns:americas\r\n
dependency americas:sales.colombia ns:americas\r\n
Here we add keys representing the two sales regions (ns:america, ns:asia).
The total sales
per region are stored (e.g. americas:sales), as well as the individual
country and item sales
which constitute the total. The aggregate is made dependent on the country
sales, and
the country sales are dependent on the region. Any number of dependencies
can be added,
but of course this consumes memory per key.
If the sales figures for the USA or Colombia changes, americas:sales is
automatically
deleted. To delete the aggregate data for the Americas region, we delete
ns:americas.
Implementing namespaces in this way may cause issues in that a dependency
would
be one of the first items scavenged if there is no need to retrieve it
- like ns:americas in this example (there is no useful data associated).
The client can handle this with a multi-key get which always includes such a
key.
PHP example :
function GetEx($memcache, $key)
{
$pos = strpos($key, ":"); // we use ":" as namespace separator
if (FALSE == $pos) {
/* not namespaced, use single get */
return $memcache->get($key);
}
$namespace = "ns:" . substr($key, 0, $pos);
$arr = $memcache->get( array($key, $namespace) );
return $arr[$key];
}
Dependency command
--------------------------
dependency <key> <dependency>*\r\n
- <key> is the key under which the dependent item is stored.
- <dependency>* means one or more key strings separated by whitespace.
Each <dependency> must represent a valid undeleted, unexpired item.
Each item specified by the dependency key has <key> added to its list of
dependents.
An item cannot be added as a dependency to itself.
After sending the command line and the data block the client awaits
the reply, which may be:
- "OK\r\n", to indicate success.
- "NOT_FOUND\r\n" to indicate the item or one of the dependencies was was
not found.
Needed Changes
------------------------
- Addition of storage for dependent keys per item struct.
- void item_add_dependencies(conn *c, item *node, item **dependencies, int
count, int check_cycles)
Add node as a dependent to each item in the dependencies list. This
optionally performs a
topological sort to detect possible dependency cycles.
- Add a server setting to allow/disallow dependency cycle detection.
- Modification of the item_unlink routine to recursively unlink dependents
of
the item. When the dependency is deleted or modified, all of its
dependents
are deleted recursively.
- Modification of item_free to cleanup dependent key storage.
- An addition to the protocol to allow client setting of item dependencies.
Caveats
-------
This implementation assumes that a key and its dependencies are located in a
single memcached
server instance. Cycle detection has an effect performance, so it is made
optional by setting.
Dependents of a key are stored per node and consume (2 + strlen(key)) bytes
each.
Any thoughts ?
clayton collie
More information about the memcached
mailing list