Notes from the fourth memcached hackathon of undetermined frequency
Dustin Sallings
dustin at spy.net
Wed Apr 16 16:52:20 UTC 2008
Last night was pretty good. As far as I know, we made progress in
two areas:
* multi-storage interface
* finalizing the last complaints of the binary protocol
# Multi Storage Interface
As for the multi-storage interface, Toru showed us some decent code
and we talked a bit about how it works and how it could work.
Any lock around data storage belongs in the engine. The core server
will not enforce any kind of locking. There are also locks around the
connection data structure pool which nobody really cares about (if
it's a performance problem, don't connect as much), and some locks
around stats outside of the engine.
Stats are ugly. Some *must* live outside of the engine (e.g. rusage)
and some *must* live inside of the engine (e.g. curr_items). An
engine will have an engine-specific stats function that takes a
context and a callback. It makes one callback per key/value pair of
engine-specific stats, as it sees appropriate based on the context
(which includes a parameter to satisfy item stats requests).
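To make the shape of that stats function concrete, here is a minimal
sketch in C. Nothing here is Toru's actual code: the type names
(add_stat_fn, stats_context, toy_engine), the "items" group string and
the stat names emitted are all hypothetical, chosen only to show one
callback per key/value pair driven by a context.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: invoked once per engine-specific stat. */
typedef void (*add_stat_fn)(const char *key, const char *value, void *cookie);

/* Hypothetical context; `group` carries a sub-request such as "items". */
typedef struct {
    const char *group;   /* NULL means the default stats */
} stats_context;

/* Toy engine state, standing in for whatever a real engine tracks. */
typedef struct {
    unsigned long curr_items;
    unsigned long evictions;
} toy_engine;

/* Engine-side stats function: the engine decides which pairs to emit. */
static void toy_engine_stats(const toy_engine *e, const stats_context *ctx,
                             add_stat_fn add_stat, void *cookie)
{
    char buf[32];
    assert(e != NULL && ctx != NULL && add_stat != NULL);

    snprintf(buf, sizeof buf, "%lu", e->curr_items);
    add_stat("curr_items", buf, cookie);

    /* Extra pairs only when the context asks for item stats. */
    if (ctx->group != NULL && strcmp(ctx->group, "items") == 0) {
        snprintf(buf, sizeof buf, "%lu", e->evictions);
        add_stat("items:evictions", buf, cookie);
    }
}

/* Example callback: count the pairs it receives. */
static void count_stat(const char *key, const char *value, void *cookie)
{
    (void)key;
    (void)value;
    ++*(int *)cookie;
}
```

The core server would pass its own callback (for example one that
writes a response packet per pair) and never needs to know which
stats a given engine keeps.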
Possibly more happened here, but it was late and few people slept
much the night before. Someone please fill in more detail. :)
# Binary Protocol
Symmetry is important and was a major goal here. There were two
dimensions of asymmetry we rectified (comparison to the text protocol
and the packet structure itself). Don't stop reading yet, though,
because details follow:
## Command Equality
Four commands that had made it into the text protocol were missing
altogether from the binary spec, and one was missing some
functionality that seems ridiculous, but that someone, somewhere uses:
* quit (Trond sent a patch)
* prepend
* append
* flush's time bomb
* stats (are ugly)
The first three are obvious.
In the text protocol, flush_all takes a parameter that causes the
server to behave normally for a while and then suddenly drop the cache
after a bit of time. I think this confuses people, but if we can't
remove it from one protocol, we can't omit it from the other.
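For anyone unsure what the time bomb means in practice, here is one
way it could be modeled in C. This is a sketch under assumptions, not
how the server implements it: flush_state, schedule_flush and
item_is_flushed are hypothetical names, and the lazy "check at read
time" approach is just one possible design.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical cache-wide flush state; 0 means no flush is pending. */
typedef struct {
    time_t flush_at;
} flush_state;

/* Schedule a flush `delay` seconds after `now`. */
static void schedule_flush(flush_state *fs, time_t now, time_t delay)
{
    assert(fs != NULL);
    fs->flush_at = now + delay;
}

/* An item counts as flushed once the deadline has passed, provided the
 * item was stored at or before that deadline. */
static bool item_is_flushed(const flush_state *fs, time_t item_stored,
                            time_t now)
{
    assert(fs != NULL);
    return fs->flush_at != 0
        && now >= fs->flush_at
        && item_stored <= fs->flush_at;
}
```

Until the deadline the server behaves normally. After it, everything
stored up to that point reads as gone, while newer items survive.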
Stats are ugly and required a lot of thought. We tried to hold off on
designing a mechanism as long as we could (Trond suggested we wait
for, you know, someone to care), but in the end, we decided to do
something like an implied multi-get since Dormando claims to care a
lot. Enough to implement it anyway. :)
A stats command is issued with a single string parameter, and the
server returns multiple responses, each containing a key, and a string
value. A terminating packet indicates the server has nothing more to
say. [We didn't really talk about the details of this, but I'd
recommend terminating with a stat with a 0 length key and 0 length
value.]
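A client-side view of that exchange might look like the following C
sketch. It assumes the terminator recommended above (0 length key and
0 length value), which was not actually agreed on. stat_response is a
hypothetical, already-decoded form of a response, not a wire format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoded stats response: just the key and the value. */
typedef struct {
    const char *key;
    size_t key_len;
    const char *value;
    size_t value_len;
} stat_response;

/* True for the suggested terminator: empty key and empty value. */
static int is_stats_terminator(const stat_response *r)
{
    return r->key_len == 0 && r->value_len == 0;
}

/* Walk responses in order and count the stats before the terminator.
 * Returns -1 if the responses run out without one. */
static int count_stats(const stat_response *rs, size_t n)
{
    int seen = 0;
    assert(rs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (is_stats_terminator(&rs[i]))
            return seen;
        seen++;
    }
    return -1;
}
```

A real client would read packets off the socket instead of an array,
but the stopping rule is the same.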
## New Commands
Brian made his case for get returning a key (sometimes). We solved
this problem by adding a get command that returns a key. I'll call it
getk and hopefully remember why after I hit send. So there are now
*four* get commands in the binary protocol:
* get (returns an object or error; does not return key)
* getq (returns an object; does not return missing; does not return
key)
* getk (returns an object with key or error)
* getkq (returns an object with key; does not return missing)
We also defined semantics for a setq command:
* setq (does not return in the normal case, opaque identifies
failures)
These commands do not currently have command IDs reserved.
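The reply rules for these commands can be summarized in a few lines of
C. This is only a sketch: the enum values are placeholders (no command
IDs exist yet), reply_plan and plan_reply are hypothetical names, and
whether a getk error carries the key is not something we settled, so
the sketch only includes the key on success.

```c
#include <assert.h>
#include <stdbool.h>

/* Command names from the notes; the numeric values are placeholders. */
typedef enum { CMD_GET, CMD_GETQ, CMD_GETK, CMD_GETKQ, CMD_SETQ } command;

typedef struct {
    bool send;         /* does the server reply at all? */
    bool include_key;  /* does the reply carry the key? */
} reply_plan;

/* `ok` means a hit for the get family and a successful store for setq. */
static reply_plan plan_reply(command cmd, bool ok)
{
    reply_plan p = { true, false };
    switch (cmd) {
    case CMD_GET:                       /* object or error, never a key */
        break;
    case CMD_GETQ:                      /* silent on a miss */
        p.send = ok;
        break;
    case CMD_GETK:                      /* object with key, or error */
        p.include_key = ok;
        break;
    case CMD_GETKQ:                     /* object with key, silent on a miss */
        p.send = ok;
        p.include_key = ok;
        break;
    case CMD_SETQ:                      /* silent unless the set failed */
        p.send = !ok;
        break;
    }
    return p;
}
```

Note that the quiet get variants stay silent on a miss, while setq
stays silent on success; the opaque is what lets a client match a
setq failure to the request that caused it.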
## Packet Header
Brian pointed out that CAS as I imagined it left him with a feeling
of incompleteness that could only be resolved by putting CAS on
everything. Set now returns a CAS ID (of the newly created object),
and delete honors a CAS ID (delete the object with this key iff it
has this value).
We accomplished this by moving the CAS identifier up into the
standard header in both directions. Yes, you can now request a
version with a CAS and you'll get the version back with a CAS.
Hopefully nobody decides that the value is important here and should
be honored.
We also put key length back into the response header (by eating the
reserved field at byte 6). Any response may now include a key and
every packet sniffer should be happy to see it (note that this makes
the implementations of getk and stats really obvious, too).
While messing around with the header, we swapped the location of this
new key length in the response and the status. With this change, we
have the same packet structure for both directions, with the exception
of the request's reserved field being used as the response status.
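To illustrate what one structure for both directions buys us, here is
a C sketch of a shared header and its encoder. These notes do not
spell out the full layout, so the field names, widths, order, the 24
byte total and the big-endian encoding are all assumptions. The sketch
only tries to be consistent with what is described above: a key
length and a CAS in every packet, and a 16-bit slot at byte 6 that is
reserved in requests and holds the status in responses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HEADER_LEN = 24 };   /* assumed total size, not from the notes */

/* One header for requests and responses (assumed layout). */
typedef struct {
    uint8_t  magic;
    uint8_t  opcode;
    uint16_t key_length;
    uint8_t  extras_length;
    uint8_t  data_type;
    uint16_t reserved_or_status;  /* reserved in requests, status in responses */
    uint32_t body_length;
    uint32_t opaque;
    uint64_t cas;
} packet_header;

/* Write the low `n` bytes of `v` in big-endian order. */
static void put_be(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (uint8_t)(v >> (8 * (n - 1 - i)));
}

/* Serialize field by field so struct padding never reaches the wire. */
static void encode_header(const packet_header *h, uint8_t out[HEADER_LEN])
{
    assert(h != NULL && out != NULL);
    out[0] = h->magic;
    out[1] = h->opcode;
    put_be(out + 2, h->key_length, 2);
    out[4] = h->extras_length;
    out[5] = h->data_type;
    put_be(out + 6, h->reserved_or_status, 2);
    put_be(out + 8, h->body_length, 4);
    put_be(out + 12, h->opaque, 4);
    put_be(out + 16, h->cas, 8);
}
```

With a layout like this, the same encoder and decoder serve both
directions, and only the meaning of the slot at byte 6 depends on
which way the packet is going.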
# Other Items
Brian gets paid in strange ways.
--
Dustin Sallings