Notes from the fourth memcached hackathon of undetermined frequency

Dustin Sallings dustin at spy.net
Wed Apr 16 16:52:20 UTC 2008


	Last night was pretty good.  As far as I know, we made progress in  
two areas:

	* multi-storage interface
	* finalizing the last complaints of the binary protocol

# Multi Storage Interface

	As for the multi-storage interface, Toru showed us some decent code  
and we talked a bit about how it works and how it could work.

	Any lock around data storage belongs in the engine.  The core server  
will not enforce any kind of locking.  There are also locks around the  
connection data structure pool which nobody really cares about (if  
it's a performance problem, don't connect as much), and some locks  
around stats outside of the engine.

	Stats are ugly.  Some *must* live outside of the engine (e.g. rusage)  
and some *must* live inside of the engine (e.g. curr_items).  An  
engine will have an engine-specific stats function that takes a  
context and a callback and makes one callback per key/value pair of  
engine-specific stats as it sees appropriate based on the context  
(which includes a parameter to satisfy item stats requests).
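
	To make the callback idea concrete, here is a rough sketch of what  
such a function might look like.  Every name in it (ToyEngine,  
get_stats, add_stat, the "items" context) is made up for illustration;  
the real engine interface was not settled in these notes.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical callback type: called once per key/value pair of stats.
StatCallback = Callable[[str, str], None]


class ToyEngine:
    """A stand-in storage engine that owns its own stats (illustrative only)."""

    def __init__(self) -> None:
        self.items: Dict[str, bytes] = {}

    def get_stats(self, context: Optional[str], add_stat: StatCallback) -> None:
        # The context decides which group of engine-specific stats is reported.
        if context is None:
            add_stat("curr_items", str(len(self.items)))
        elif context == "items":
            # Assumed shape of an item stats request: one stat per stored item.
            for key, value in sorted(self.items.items()):
                add_stat("item:" + key + ":size", str(len(value)))
        # Unknown contexts produce no stats in this sketch.


def collect_stats(engine: ToyEngine, context: Optional[str] = None) -> List[Tuple[str, str]]:
    """Core-server side: gather whatever the engine chooses to report."""
    collected: List[Tuple[str, str]] = []
    engine.get_stats(context, lambda key, value: collected.append((key, value)))
    return collected
```

	The core server never looks inside the engine here; it only hands  
over a callback, which keeps stats like rusage on the outside and stats  
like curr_items on the inside.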

	Possibly more happened here, but it was late and few people slept  
much the night before.  Someone please fill in more detail.  :)

# Binary Protocol

	Symmetry is important and was a major goal here.  There were two  
dimensions of asymmetry we rectified (comparison to the text protocol  
and the packet structure itself).  Don't stop reading yet, though,  
because details follow:

## Command Equality

	Four commands were missing altogether from the binary spec that had  
made it into the text protocol, and one was missing some functionality  
that seemed ridiculous but that someone, somewhere, uses:

	* quit (Trond sent a patch)
	* prepend
	* append
	* flush's time bomb
	* stats (are ugly)

	The first three are obvious.

	In the text protocol, flush_all takes a parameter that causes the  
server to behave normally for a while and then suddenly drop the cache  
after a bit of time.  I think this confuses people, but if we can't  
remove it from one protocol, we can't omit it from the other.

	Stats are ugly and required a lot of thought.  We tried to hold off on  
designing a mechanism as long as we could (Trond suggested we wait  
for, you know, someone to care), but in the end, we decided to do  
something like an implied multi-get since Dormando claims to care a  
lot.  Enough to implement it anyway.  :)

	A stats command is issued with a single string parameter, and the  
server returns multiple responses, each containing a key, and a string  
value.  A terminating packet indicates the server has nothing more to  
say.  [We didn't really talk about the details of this, but I'd  
recommend terminating with a stat with a 0 length key and 0 length  
value.]
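
	The exchange described above could be sketched roughly as follows,  
assuming the suggested terminator (a stat with a 0 length key and a  
0 length value) is what gets adopted.  Packets are reduced to  
(key, value) byte pairs, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

StatPacket = Tuple[bytes, bytes]

# Assumed terminator: a stat with a zero-length key and zero-length value.
TERMINATOR: StatPacket = (b"", b"")


def stat_responses(stats: Dict[str, str]) -> Iterator[StatPacket]:
    """Server side: one response per stat, then the terminating packet."""
    for key, value in stats.items():
        yield key.encode(), value.encode()
    yield TERMINATOR


def read_stats(packets: Iterable[StatPacket]) -> Dict[str, str]:
    """Client side: consume responses until the terminator shows up."""
    result: Dict[str, str] = {}
    for key, value in packets:
        if len(key) == 0 and len(value) == 0:
            return result
        result[key.decode()] = value.decode()
    raise EOFError("stream ended before the terminating stat packet")
```

	A client written this way does not need to know in advance how many  
stats the server intends to send, which is the point of the implied  
multi-get.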

## New Commands

	Brian made his case for get returning a key (sometimes).  We solved  
this problem by adding a get command that returns a key.  I'll call it  
getk and hopefully remember why after I hit send.  So there are now  
*four* get commands in the binary protocol:

	* get   (returns an object or error; does not return key)
	* getq  (returns an object; does not return missing; does not return  
key)
	* getk  (returns an object with key or error)
	* getkq (returns an object with key; does not return missing)

	We also defined semantics for a setq command:

	* setq  (does not return in the normal case, opaque identifies  
failures)

	These commands do not currently have command IDs reserved.
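
	The differences between these commands are easy to mix up, so here  
is a small sketch of the semantics listed above.  It is not wire-level  
code: responses are plain tuples, the status strings are invented, and  
the size limit used to make setq fail is purely hypothetical.  Whether  
a getk miss echoes the key is not stated above; this sketch assumes it  
does not.

```python
from typing import Dict, List, Optional, Tuple

# (status, key or None, value or None); the shape is illustrative only.
GetResponse = Tuple[str, Optional[str], Optional[bytes]]


def handle_get(cache: Dict[str, bytes], command: str, key: str) -> List[GetResponse]:
    """Return the responses the server would send for one get-family command."""
    if command not in ("get", "getq", "getk", "getkq"):
        raise ValueError("unknown get command: " + command)
    quiet = command.endswith("q")               # quiet variants stay silent on a miss
    with_key = command in ("getk", "getkq")     # k variants echo the key on a hit
    value = cache.get(key)
    if value is None:
        # Assumption: the miss response carries no key, even for getk.
        return [] if quiet else [("not_found", None, None)]
    return [("ok", key if with_key else None, value)]


def handle_setq(cache: Dict[str, bytes], key: str, value: bytes,
                opaque: int, max_size: int = 1024) -> List[Tuple[str, int]]:
    """setq: nothing is sent on success; a failure is tagged with the opaque."""
    if len(value) > max_size:                   # hypothetical failure condition
        return [("too_large", opaque)]
    cache[key] = value
    return []
```

	Because the quiet variants send nothing in the common case, a client  
pipelining many of them has to rely on the opaque (or the returned key)  
to match whatever does come back to the request that caused it.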

## Packet Header

	Brian pointed out that CAS as I imagined it left him with a feeling  
of incompleteness that could only be resolved by putting CAS on  
everything.  Set now returns a CAS ID (of the newly created object),  
delete honors a CAS ID (delete the object with this key iff it has  
this value).
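
	As a minimal sketch of those two behaviors, assuming a toy  
in-memory store: set hands back the CAS ID of the object it just  
created, and delete only removes the object if the CAS ID matches.  
The class name, the status strings, and the convention that a CAS of 0  
means "don't check" are all assumptions made for the example.

```python
import itertools
from typing import Dict, Tuple


class CasStore:
    """Toy store where every stored object carries a CAS ID (illustrative)."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, int]] = {}
        self._next_cas = itertools.count(1)

    def set(self, key: str, value: bytes) -> int:
        # Every set creates a new version and returns its CAS ID.
        cas = next(self._next_cas)
        self._items[key] = (value, cas)
        return cas

    def delete(self, key: str, cas: int = 0) -> str:
        entry = self._items.get(key)
        if entry is None:
            return "not_found"
        # Assumption: cas == 0 means the caller does not care about versions.
        if cas != 0 and entry[1] != cas:
            return "exists"
        del self._items[key]
        return "deleted"
```

	A delete carrying a stale CAS ID leaves the newer object alone,  
which is the "iff it has this value" part.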

	We accomplished this by moving the CAS identifier up into the  
standard header in both directions.  Yes, you can now request a  
version with a CAS and you'll get the version back with a CAS.   
Hopefully nobody decides that the value is important here and should  
be honored.

	We also put key length back into the response header (by eating the  
reserved field at byte 6).  Any response may now include a key and every  
packet sniffer should be happy to see it (note that this makes the  
implementations of getk and stat really obvious, too).

	While messing around with the header, we swapped the location of this  
new key length in the response and the status.  With this change, we  
have the same packet structure for both directions with the exception  
of the request's reserved field being used as a response status.
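
	For illustration, here is how a symmetric header like this can be  
packed and unpacked.  These notes do not spell out exact offsets, so  
the field order, the 24-byte size, and the magic values below are taken  
from the layout the binary protocol later published and may not match  
the draft as it stood that night.  The opcode values are placeholders.

```python
import struct
from typing import Dict

# Assumed layout (network byte order, 24 bytes):
#   magic(1) opcode(1) key_len(2) extras_len(1) data_type(1)
#   reserved-or-status(2) body_len(4) opaque(4) cas(8)
HEADER = struct.Struct("!BBHBBHIIQ")

REQUEST_MAGIC = 0x80   # assumption: values from the later published protocol
RESPONSE_MAGIC = 0x81


def pack_request(opcode: int, key_len: int, extras_len: int,
                 body_len: int, opaque: int, cas: int = 0) -> bytes:
    # In a request the two bytes after data_type are reserved (zero).
    return HEADER.pack(REQUEST_MAGIC, opcode, key_len, extras_len, 0,
                       0, body_len, opaque, cas)


def pack_response(opcode: int, status: int, key_len: int, extras_len: int,
                  body_len: int, opaque: int, cas: int = 0) -> bytes:
    # In a response the same two bytes carry the status.
    return HEADER.pack(RESPONSE_MAGIC, opcode, key_len, extras_len, 0,
                       status, body_len, opaque, cas)


def unpack_header(data: bytes) -> Dict[str, int]:
    names = ("magic", "opcode", "key_len", "extras_len", "data_type",
             "status_or_reserved", "body_len", "opaque", "cas")
    return dict(zip(names, HEADER.unpack(data[:HEADER.size])))
```

	One unpack function serves both directions; the only thing a reader  
has to interpret differently is the status_or_reserved field.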

# Other Items

	Brian gets paid in strange ways.

-- 
Dustin Sallings


