Binary Protocol...

Sean Chittenden sean at chittenden.org
Wed Dec 8 14:03:25 PST 2004


Howdy all.  In the interests of feature growth and moving away from the
convenient, but rather expensive text protocol, I'd like to propose the
binary memcache protocol.  If you haven't ever worked with binary
protocols, please refrain from weighing in on the technical aspects of
this discussion, as the potential for a bikeshed[1] is huge.  If you're
not seeing the feature you're interested in (with the exception of HTTP
protocol support!), please drop me a line off-list so that I can add it
accordingly.  To those innocent bystanders on the list who don't care
one way or another, I apologize in advance if this is of no interest.
Lastly, just because there is protocol support for a feature does *not*
mean that the feature will be present when this is implemented, nor
that the feature will ever be added.  Thanks.

[1]	Please refer to:
		http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/misc.html#BIKESHED-PAINTING
		http://www.freebsd.org/doc/en_US.ISO8859-1/articles/mailing-list-faq/bikeshed.html


The memcache(4) binary protocol:

The memcache(4) binary protocol consists of a few basic packets that
cover all of the current functionality and add new functionality
to memcached(8)'s feature set.  One of the large goals of this exercise
is to maintain one packet per trip.  In a few cases, most notably
error handling, this falls apart, but for all intents and purposes,  
this should work rather well.  With the advent of libmemcache(3) as a  
library that can be used in any context, portability problems or  
different handling between language bindings should be minimized.   
Also, since libmemcache(3) is written in C, it's possible for faster  
ways of sending packets to be achieved, such as the use of T/TCP on  
FreeBSD (or the use of SCTP, which seems to be the next generation of  
the T/TCP concept).  If you're familiar with routing protocols, this  
will look rather familiar.


The HELLO Packet:

The HELLO Packet must be sent by the client when the connection is
established.  Other packets should be piggybacked on the tail of the
HELLO Packet so that only one TCP packet (and ideally only one
Ethernet frame) is sent across the network.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Version    |    Options    |  User Length  | Passwd Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Key Space ID                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                           Username                            /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                           Password                            /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Version (required):
	8 bit protocol version.  The high bit is always set as a way
	of distinguishing between the binary protocol and text protocol.
	127 different versions should be sufficient.
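
As a rough illustration of the high-bit rule above, a server could
look at the first byte of a new connection to decide which protocol
the client is speaking.  The sketch below is Python; the function
names are made up for this example, and it assumes that "high bit"
means 0x80 and that text protocol commands start with a plain ASCII
character.

```python
BINARY_FLAG = 0x80  # assumption: "high bit" is the most significant bit


def make_version_byte(version: int) -> int:
    """Build the Version byte of a HELLO packet from a 7 bit version."""
    if not 0 <= version <= 0x7F:
        raise ValueError("version must fit in 7 bits")
    return BINARY_FLAG | version


def is_binary(first_byte: int) -> bool:
    """True if the first byte on the wire has the high bit set."""
    return bool(first_byte & BINARY_FLAG)


def protocol_version(first_byte: int) -> int:
    """Strip the high bit to recover the version number."""
    return first_byte & 0x7F
```

A text protocol command such as "get" starts with an ASCII byte below
0x80, so is_binary() returns False for it.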

Options (required):
	These bits refer to the bits in the Options Byte.

	Bit 0:	Connection provides authentication information
	Bit 1:	This client connection requires TLS
	Bit 2:	Disconnect if TLS can not be negotiated
	Bit 3-7:	Not designated

Username Length (required):
	The length of the username.  Username lengths are limited to 255
	bytes, the largest value the 8 bit length field can hold.  The
	username and password are left empty until TLS has been negotiated,
	then a HELLO packet is resent with username/password information.

Password Length (required):
	The length of the password.  Password lengths are limited to 255
	bytes, the largest value the 8 bit length field can hold.

Key space ID (required):
	The key space that the client is interacting with.  Each key space
	has its own username/password (think ISPs).

Username (optional):
	The username for the connection.  This field is not sent if the
	username length is zero or if bit zero of the options byte is not
	set.  The username is not zero-padded.

Password (optional):
	The password for the connection.  This field is not sent if the
	password length is zero or if bit zero of the options byte is not
	set.  The password is not zero-padded.
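
To make the HELLO layout concrete, here is a minimal Python sketch
of how a client might build the packet.  The spec above does not
state a byte order or which end of a byte is "bit 0", so this
assumes network (big-endian) byte order and that bit 0 is the least
significant bit.  pack_hello() and the OPT_* names are hypothetical.

```python
import struct

# Assumption: "Bit 0" is the least significant bit of the Options byte.
OPT_AUTH = 1 << 0         # connection provides authentication info
OPT_WANT_TLS = 1 << 1     # client connection requires TLS
OPT_REQUIRE_TLS = 1 << 2  # disconnect if TLS can not be negotiated


def pack_hello(version, key_space_id, username=b"", password=b"", options=0):
    """Serialize a HELLO packet (assumes big-endian fields)."""
    if not 0 <= version <= 0x7F:
        raise ValueError("version must fit in 7 bits")
    if len(username) > 255 or len(password) > 255:
        raise ValueError("username/password lengths are 8 bit fields")
    if username or password:
        options |= OPT_AUTH
    header = struct.pack(
        "!BBBBI",
        0x80 | version,   # high bit marks the binary protocol
        options,
        len(username),
        len(password),
        key_space_id,
    )
    # Username and password follow the fixed header with no padding.
    return header + username + password
```

A HELLO with no credentials is just the 8 byte fixed header, so other
packets can be appended right after it in the same write.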



The STORE Packet:

The STORE Packet is the only way to add, set, or replace data on the
server.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Request    |   Options 1   |   Options 2   |   Options 3   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Client Flags          |       Zero filled padding     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Time High Bits                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Time Low Bits                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Key Length                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Value Length                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                       Max Fetch Count                         /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                              Key                              /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                             Value                             /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Request (required):
	All packets other than the HELLO packet contain a Request byte.
	Each value for the request byte is unique for the protocol
	version.  The STORE packet has a request value of 's'.

Options 1 (required):
	These bits refer to the bits in the Options 1 Byte.

	Bit 0:	If this key doesn't exist, add the key/value
	Bit 1:	If this key does exist, replace its value
	Bit 2:	Expiration is absolute
	Bit 3:	After the server has processed this request, close the
			connection.
	Bit 4:	Connection includes a max FETCH count.
	Bit 5:	Increment the value by the integer found in the Value
			field.  If the value length is zero, increment the value
			by one.  If Bit 0 is set and the key does not exist, the
			key is created with the value zero and is then incremented
			by the increment value.  This bit is mutually exclusive of
			Bits 1 and 6.
	Bit 6:	Decrement the value by the integer found in the Value
			field.  If the value length is zero, decrement the value
			by one.  If Bit 0 is set, a key is not created.  This bit
			is mutually exclusive of Bits 1 and 5.
	Bit 7:	The key is a statistical value.  This bit is mutually exclusive
			of all other Options Bits.
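
The mutual-exclusion rules for the Options 1 Byte can be checked
before a packet is sent.  The following Python sketch is one way to
do that; the constant and function names are invented for this
example, and it assumes bit 0 is the least significant bit.

```python
# Assumption: "Bit n" means (1 << n), with bit 0 least significant.
ADD = 1 << 0        # add the key/value if the key doesn't exist
REPLACE = 1 << 1    # replace the value if the key does exist
ABSOLUTE = 1 << 2   # expiration is absolute
CLOSE = 1 << 3      # close the connection after this request
MAX_FETCH = 1 << 4  # packet includes a max FETCH count
INCR = 1 << 5       # increment
DECR = 1 << 6       # decrement
STAT = 1 << 7       # key is a statistical value


def check_store_options(opts: int) -> int:
    """Return opts unchanged, or raise ValueError on a bad combination."""
    if not 0 <= opts <= 0xFF:
        raise ValueError("Options 1 is a single byte")
    if opts & INCR and opts & (REPLACE | DECR):
        raise ValueError("Bit 5 is mutually exclusive of Bits 1 and 6")
    if opts & DECR and opts & (REPLACE | INCR):
        raise ValueError("Bit 6 is mutually exclusive of Bits 1 and 5")
    if opts & STAT and opts != STAT:
        raise ValueError("Bit 7 is mutually exclusive of all other bits")
    return opts
```

Under this reading, a classic "set" is ADD | REPLACE, an "add" is ADD
alone, and a "replace" is REPLACE alone.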

Options 2 (required):
	These bits refer to the bits in the Options 2 Byte.

	Bit 0:	Update the expiration of this key to the absolute time,
			but keep the same value.
	Bit 1-7:	Not designated

Options 3 (required):
	These bits refer to the bits in the Options 3 Byte.

	Bit 0-7:	Not designated

Client Flags (required):
	Arbitrary client flags.

Zero Filled Padding (required):
	Empty.  Could be used for something in the future.

Time High Bits (required):
	This field contains high bits for 64 bit time.  Until the year 2038,
	this field will be zero.  Post 2038, this field will contain the
	overflow from 32 bit time values.  See Time Low Bits for further
	details.

Time Low Bits (required):
	This specifies in seconds the expiration of the key.  The Time Low
	Bits Field is used to carry the expiration time when using 32 bit
	time values.  Expiration of keys works as follows:

	If Bit 2 of the Options 1 Byte is set, this value specifies the
	expiration of a key in seconds from the Epoch.  If Bit 2 of the
	Options 1 Byte is not set, this value specifies the relative expiration
	of a key in seconds from the current time.
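
One possible reading of the two time fields is a plain split of a 64
bit second count into two unsigned 32 bit words, sketched below in
Python with hypothetical helper names.  Note that under this unsigned
split the high word stays zero until 2106 rather than 2038; the spec
above does not say whether the low word is meant to be treated as
signed, so take this as an assumption.

```python
def split_time(seconds: int) -> tuple:
    """Split a 64 bit time into (Time High Bits, Time Low Bits)."""
    if not 0 <= seconds < 1 << 64:
        raise ValueError("time must fit in 64 bits")
    return seconds >> 32, seconds & 0xFFFFFFFF


def join_time(high: int, low: int) -> int:
    """Rebuild the 64 bit time from the two 32 bit fields."""
    return (high << 32) | low
```

For a relative expiration of 60 seconds, split_time(60) gives a high
word of 0 and a low word of 60.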

Key Length (required):
	Specify the length of the key.  The key length has a valid range
	of 1 - 4294967295 (the maximum of the 32 bit field).

Value Length (required):
	Specify the length of the value.  The value length may be

Max Fetch Count (optional):
	Specify the maximum number of times this key can be fetched before the
	server deletes the key.  This value is required if Bit 4 of the Options
	1 Byte is set.

Key (required):
	The key for the given request.  Keys are not padded by a null
	character.

Value (optional):
	The value for the given request.  A zero length value requires that
	this value be empty.  Values are not padded by a null character.
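
Putting the STORE fields together, a client-side encoder might look
like the Python sketch below.  It assumes big-endian fields, bit 0
as the least significant bit, and a 32 bit Max Fetch Count (the
diagram shows it as a full row, but the width is not stated).
pack_store() is a hypothetical name.

```python
import struct

OPT1_MAX_FETCH = 1 << 4  # assumption: Bit 4 == 0x10


def pack_store(key, value, opts1, expires=0, client_flags=0, max_fetch=None):
    """Serialize a STORE packet (assumes big-endian fields)."""
    if not key:
        raise ValueError("key length must be at least 1")
    if max_fetch is not None:
        opts1 |= OPT1_MAX_FETCH
    elif opts1 & OPT1_MAX_FETCH:
        raise ValueError("Bit 4 set but no max fetch count given")
    header = struct.pack(
        "!cBBBHHIIII",
        b"s",                  # Request byte
        opts1, 0, 0,           # Options 1, 2, 3
        client_flags,
        0,                     # zero filled padding
        expires >> 32,         # Time High Bits
        expires & 0xFFFFFFFF,  # Time Low Bits
        len(key),
        len(value),
    )
    # Max Fetch Count is only on the wire when Bit 4 is set.
    tail = struct.pack("!I", max_fetch) if max_fetch is not None else b""
    return header + tail + key + value
```

The fixed header is 24 bytes, so a small set fits comfortably in one
Ethernet frame alongside a HELLO.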


The FETCH Packet:

The FETCH Packet is the only way to fetch data from the server.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Request    |   Options 1   |   Options 2   |   Options 3   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Key Length                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                              Key                              /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Request (required):
	All packets other than the HELLO packet contain a Request byte.
	Each value for the request byte is unique for the protocol
	version.  The FETCH Packet has a request value of 'f'.

Options 1 (required):
	These bits refer to the bits in the Options 1 Byte.

	Bit 0:	If this key exists and has a relative expiration, reset
			the expiration to be relative to the current time.
	Bit 1:	Request that the server delete the key after sending the
			value to the client.
	Bit 2:	After the server has processed this request, close the
			connection.
	Bit 3:	If the key exists, include the expiration of the key in
			the response from the server.
	Bit 4:	If the key exists, include the number of fetch requests
			left for this key.
	Bit 5-7:	Not designated

Options 2 (required):
	These bits refer to the bits in the Options 2 Byte.

	Bit 0-7:	Not designated

Options 3 (required):
	These bits refer to the bits in the Options 3 Byte.

	Bit 0-7:	Not designated

Key Length (required):
	Specify the length of the key.  The key length has a valid range
	of 1 - 4294967295 (the maximum of the 32 bit field).

Key (required):
	The key for the given request.  Keys are not padded by a null
	character.
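
The FETCH Packet is small enough that an encoder is only a few lines.
This Python sketch makes the same assumptions as before (big-endian
fields, bit 0 least significant); pack_fetch() and the FETCH_* names
are made up for illustration.

```python
import struct

# Assumption: "Bit n" of the Options 1 Byte means (1 << n).
FETCH_TOUCH = 1 << 0       # reset a relative expiration
FETCH_DELETE = 1 << 1      # delete the key after sending the value
FETCH_CLOSE = 1 << 2       # close the connection afterwards
FETCH_WANT_EXPIRY = 1 << 3 # include the expiration in the response
FETCH_WANT_COUNT = 1 << 4  # include the fetch requests left


def pack_fetch(key: bytes, opts1: int = 0) -> bytes:
    """Serialize a FETCH packet (assumes big-endian fields)."""
    if not key:
        raise ValueError("key length must be at least 1")
    # Request byte, Options 1-3, then the 32 bit Key Length.
    return struct.pack("!cBBBI", b"f", opts1, 0, 0, len(key)) + key
```

A fetch of the key "foo" with no options comes to 11 bytes on the
wire: an 8 byte header plus the 3 byte key.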


The ERROR Packet:

The ERROR Packet is one of the ways a server responds to client  
requests.  Not all ERROR Packets are fatal errors and indeed, the  
server responds with an ERROR Packet after a STORE Packet has been  
processed by the server.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Response   |  Conn. Status |  Major Status |  Minor Status |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Message Length                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                            Message                            /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Response (required):
	All packets from the server contain a Response Byte.  Each
	value for the Response Byte is unique for the protocol version.
	The ERROR Packet has a response value of 'e'.

Connection Status (required):
	The Connection Status Field contains a response that determines the
	status of this server.  The Connection Status Codes are represented
	in ASCII that way they are easily human decipherable.  Possible values
	for the Connection Status Field are:

	'C': Server has closed this connection and asked the client to not
		reconnect to this server again.

	'c':	Server has closed this connection.

	'F':	Server has experienced a fatal server error and closed all
		client connections, has shut down, and should be removed from
		server lists.

	'f': Server has experienced a fatal server error and closed all
		client connections, has shut down, but will restart itself.

	'g': The connection status is good, golden, groovy, great, good to go,
		super green, all things nominal, or "locked, loaded, rocked, and
		ready to roll."  Connection remains open.

	'R': Server has requested that this server be removed from the server
		list.  Connection has been closed by the server.

	'r': Server has requested that this server be removed from the server
		list, but the connection remains available.
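
A client library could turn the Connection Status byte into two
decisions: whether the socket is still usable, and whether the server
should stay in the client's server list.  The table below is one
interpretation of the codes above, not something the spec spells out
(in particular, treating 'C' as "drop from the list" is a guess), and
all of the names are hypothetical.

```python
from typing import NamedTuple


class ConnAction(NamedTuple):
    connection_open: bool  # can this socket still be used?
    keep_in_list: bool     # should the server stay in the server list?


# Hypothetical mapping of each status byte to a client reaction.
CONN_STATUS = {
    "C": ConnAction(False, False),  # closed, asked not to reconnect
    "c": ConnAction(False, True),   # closed
    "F": ConnAction(False, False),  # fatal error, remove from lists
    "f": ConnAction(False, True),   # fatal error, server will restart
    "g": ConnAction(True, True),    # all good, connection stays open
    "R": ConnAction(False, False),  # remove from list, closed
    "r": ConnAction(True, False),   # remove from list, still open
}


def handle_conn_status(code: bytes) -> ConnAction:
    """Look up the reaction for a one byte Connection Status code."""
    try:
        return CONN_STATUS[code.decode("ascii")]
    except (KeyError, UnicodeDecodeError):
        raise ValueError("unknown connection status %r" % (code,)) from None
```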


Major Status (required):
	The Major Status Field contains a status code that specifies the
	type of error the server had.  Status codes are represented by ASCII
	characters so that responses are human decipherable.  Possible
	values for the Major Status Field are:

	'd':	Unable to decrement
	'i':	Unable to increment
	'n': Not Found
	'N':	Not stored
	'S':	Stored


Minor Status (required):
	The Minor Status Field contains a status code that specifies the
	type of error the server had.  Status codes are represented by ASCII
	characters that way responses are human decipherable.  The Minor Status
	Field contains values that are specific to the Major Status code.  All
	Minor Status Codes support the ability to return ' ' which indicates
	that the client should see the Message for additional information.

	Major Status "d"'s available Minor Status Codes:
		'i':	Invalid value
		'd':	Key does not exist

	Major Status "i"'s available Minor Status Codes:
		'i':	Invalid value
		'd':	Key does not exist

	Major Status "N"'s available Minor Status Codes:
		'a':	Already stored
		'd':	Key does not exist

	Major Status "n"'s available Minor Status Codes:
		'd':	Key does not exist
		'D': Key used to exist, but has been deleted

	Major Status "S"'s available Minor Status Codes:
		' ':	No problems

Message Length (required):
	The length of the Message Field.  If an additional error message is
	sent by the server, this field will contain a non-zero value;
	otherwise it is zero.

Message (optional):
	The Message Field is required if the Message Length Field is
	non-zero.  This message contains any additional error response
	that can't be derived from the Status Field.  The Message Field is
	not padded by a null character.
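
On the client side, decoding an ERROR Packet could look roughly like
this Python sketch.  It assumes a big-endian Message Length and that
the whole packet is already in the buffer; ErrorPacket and
parse_error() are hypothetical names.

```python
import struct
from typing import NamedTuple


class ErrorPacket(NamedTuple):
    conn_status: str
    major: str
    minor: str
    message: bytes


def parse_error(buf: bytes) -> ErrorPacket:
    """Decode one ERROR packet from the start of buf."""
    if len(buf) < 8:
        raise ValueError("short ERROR packet header")
    resp, conn, major, minor, mlen = struct.unpack_from("!ccccI", buf)
    if resp != b"e":
        raise ValueError("not an ERROR packet")
    message = buf[8:8 + mlen]
    if len(message) != mlen:
        raise ValueError("truncated message")
    return ErrorPacket(
        conn.decode("ascii"), major.decode("ascii"),
        minor.decode("ascii"), message,
    )


def was_stored(pkt: ErrorPacket) -> bool:
    """A Major Status of 'S' is the success reply to a STORE."""
    return pkt.major == "S"
```

Since a successful STORE is also answered with an ERROR Packet, the
caller has to look at the Major Status rather than the packet type to
tell success from failure.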



The RESPONSE Packet:

A RESPONSE Packet contains some form of a value response from a FETCH
request.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Response   | Conn. Status  |           Client Flags        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Options 1   |   Options 2   |   Options 3   |   Options 4   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Value Length                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                        Time High Bits                         /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                        Time Low Bits                          /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                       Max Fetch Count                         /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                            Value                              /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Response (required):
	All packets from the server contain a Response Byte.  Each
	value for the Response Byte is unique for the protocol version.
	The RESPONSE Packet has a response value of 'r'.

Connection Status (required):
	See the Connection Status Field description in the ERROR Packet
	description.

Client Flags (required):
	Any client flags set by the client when the key was stored.

Options 1 (required):
	Bit 0:	Absolute expiration included in response.
	Bit 1:	Relative expiration included in response.
	Bit 2:	Max fetch count before request included in response.  If
			zero, there is no max fetch count set for this key.  A
			fetch count of 1 means this key expires immediately after
			it has been sent to the client.
	Bit 3-7:	Not designated

Options 2 (required):
	Bit 0-7:	Not designated

Options 3 (required):
	Bit 0-7:	Not designated

Options 4 (required):
	Bit 0-7:	Not designated

Value Length (required):
	Specify the length of the value.  The value length has a valid range
	of 0 - 4294967295 (the maximum of the 32 bit field).

Value (optional):
	If the Value Length Field is greater than zero, this field is required.
	A zero length value is valid.
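
The optional fields make the RESPONSE Packet the trickiest one to
decode.  The Python sketch below assumes that both time words are
present whenever Bit 0 or Bit 1 of the Options 1 Byte is set, that
Max Fetch Count is a 32 bit field present when Bit 2 is set, that
fields are big-endian, and that bit 0 is the least significant bit.
None of that is stated outright above, and the names are
hypothetical.

```python
import struct
from typing import NamedTuple, Optional


class FetchResponse(NamedTuple):
    conn_status: str
    client_flags: int
    expires: Optional[int]    # None if no expiration was included
    max_fetch: Optional[int]  # None if no fetch count was included
    value: bytes


def parse_response(buf: bytes) -> FetchResponse:
    """Decode one RESPONSE packet from the start of buf."""
    if len(buf) < 12:
        raise ValueError("short RESPONSE packet header")
    resp, conn, flags, o1, _o2, _o3, _o4, vlen = struct.unpack_from(
        "!ccHBBBBI", buf)
    if resp != b"r":
        raise ValueError("not a RESPONSE packet")
    offset, expires, max_fetch = 12, None, None
    if o1 & 0b011:  # Bit 0 (absolute) or Bit 1 (relative) expiration
        high, low = struct.unpack_from("!II", buf, offset)
        expires = (high << 32) | low
        offset += 8
    if o1 & 0b100:  # Bit 2: max fetch count included
        (max_fetch,) = struct.unpack_from("!I", buf, offset)
        offset += 4
    value = buf[offset:offset + vlen]
    if len(value) != vlen:
        raise ValueError("truncated value")
    return FetchResponse(conn.decode("ascii"), flags, expires, max_fetch, value)
```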


Additional Notes:

If a client connects and sends an invalid request that is out of bounds
for the protocol, the server responds with a plain text error message
and closes the connection.  The format for the plain text error
response is:

ERROR [code]: [message]\n
[custom message]\n
<server closes connection>

Stats:  The only thing I haven't spec'ed out is a way of pulling all  
stats in one go.  I guess it could just be a special stats key.


I plan on rewriting this in nroff(7)/mdoc(7) so that the formatting is
consistent but haven't had the time yet.  I'm tempted to rename the
RESPONSE Packet to the DATA Packet and rename the ERROR Packet to the
RESPONSE Packet, but haven't yet... I probably will though.  There are
a few other message types that need to be added, such as a stats
message (server list update?).  Thanks in advance.  Comments/discussion
welcome.  I'll assume a non-response as approval, though private
emails w/ some form of approval are appreciated.  -sc

-- 
Sean Chittenden


