Binary Protocol...
Sean Chittenden
sean at chittenden.org
Wed Dec 8 14:03:25 PST 2004
Howdy all. In the interests of feature growth and moving away from the
convenient, but rather expensive, text protocol, I'd like to propose the
binary memcache protocol. If you haven't ever worked with binary
protocols, please refrain from weighing in on the technical aspects of
this discussion, as the potential for a bikeshed[1] is huge. If you're
not seeing the feature you're interested in (with the exception of HTTP
protocol support!), please drop me a line offlist so that I can add it
accordingly. To those innocent bystanders on the list who don't care
one way or another, I apologize in advance if this is of no interest.
Lastly, just because there is protocol support for a feature does *not*
mean that the feature will be present when this is implemented, nor that
the feature will ever be added. Thanks.
[1] Please refer to:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/misc.html#BIKESHED-PAINTING
http://www.freebsd.org/doc/en_US.ISO8859-1/articles/mailing-list-faq/bikeshed.html
The memcache(4) binary protocol:
The memcache(4) binary protocol consists of a few basic packets that
accomplish all of the current functionality and add new functionality
to memcached(8)'s feature set. One of the main goals of this exercise
is to keep every exchange to one packet per trip. In a few cases, most notably
error handling, this falls apart, but for all intents and purposes,
this should work rather well. With the advent of libmemcache(3) as a
library that can be used in any context, portability problems or
different handling between language bindings should be minimized.
Also, since libmemcache(3) is written in C, it's possible for faster
ways of sending packets to be achieved, such as the use of T/TCP on
FreeBSD (or the use of SCTP, which seems to be the next generation of
the T/TCP concept). If you're familiar with routing protocols, this
will look rather familiar.
The HELLO Packet:
The HELLO Packet must be sent by the client when the connection is
established. Other packets should be piggybacked on the tail of the
HELLO packet so that only one TCP packet (and ideally only one
Ethernet frame) is sent across the network.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Version    |    Options    |  User Length  | Passwd Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Key Space ID                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                           Username                            /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                           Password                            /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Version (required):
8 bit protocol version. The high bit is always set as a way
of distinguishing between the binary protocol and text protocol.
127 different versions should be sufficient.
Options (required):
These bits refer to the bits in the Options Byte.
Bit 0: Connection provides authentication information
Bit 1: This client connection requires TLS
Bit 2: Disconnect if TLS can not be negotiated
Bit 3-7: Not designated
Username Length (required):
The length of the username. Username lengths are limited to 255 bytes,
the largest value the 8 bit length field can hold. Usernames and
passwords are left empty until TLS has been negotiated, then a HELLO
packet is resent with username/password information.
Password Length (required):
The length of the password. Password lengths are limited to 255 bytes.
Key Space ID (required):
The key space that the client is interacting with. Each key space has
its own username/password (think ISPs).
Username (optional):
The username for the connection. This field is not sent if the username
length is zero or if bit zero of the options byte is not set. The
username is not zero-padded.
Password (optional):
The password for the connection. This field is not sent if the password
length is zero or if bit zero of the options byte is not set. The
password is not zero-padded.
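
As a rough illustration of the HELLO layout above, here is a minimal
Python sketch of how a client might pack the packet. Several things in
it are assumptions that the post does not state: multi-byte fields are
taken to be in network byte order (big-endian), "Bit 0" is taken to be
the least significant bit of the Options byte, and the names
build_hello, is_binary_protocol, and the HELLO_OPT_* constants are
hypothetical.

```python
import struct

# Option bits for the HELLO Options byte. Assumption: "Bit 0" is the
# least significant bit; the post does not say which end is bit 0.
HELLO_OPT_AUTH = 1 << 0         # connection provides authentication information
HELLO_OPT_WANT_TLS = 1 << 1     # this client connection requires TLS
HELLO_OPT_TLS_OR_DROP = 1 << 2  # disconnect if TLS cannot be negotiated


def is_binary_protocol(first_byte: int) -> bool:
    """The high bit of the Version byte marks the binary protocol."""
    return bool(first_byte & 0x80)


def build_hello(version: int, key_space_id: int,
                username: bytes = b"", password: bytes = b"",
                options: int = 0) -> bytes:
    """Pack a HELLO packet (hypothetical helper, big-endian assumed)."""
    if not 0 <= version <= 0x7F:
        raise ValueError("version must fit in the low 7 bits")
    if len(username) > 255 or len(password) > 255:
        raise ValueError("username and password lengths must fit in 8 bits")
    if username or password:
        options |= HELLO_OPT_AUTH            # credentials follow the header
    else:
        options &= ~HELLO_OPT_AUTH & 0xFF    # nothing to send, clear Bit 0
    header = struct.pack("!BBBBI", 0x80 | version, options,
                         len(username), len(password), key_space_id)
    # Username and Password are appended as-is, without zero padding.
    return header + username + password
```

An anonymous HELLO built this way is just the 8 byte fixed header, so
another request can be appended to it and sent in the same write.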
The STORE packet:
The STORE packet is the only way to add, set, or replace data on the server.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Request    |   Options 1   |   Options 2   |   Options 3   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Client Flags          |      Zero filled padding      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Time High Bits                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Time Low Bits                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Key Length                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Value Length                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                        Max Fetch Count                        /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                              Key                              /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                             Value                             /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Request (required):
All packets other than the HELLO packet contain a Request byte.
Each value for the request byte is unique for the protocol
version. The STORE packet has a request value of 's'.
Options 1 (required):
These bits refer to the bits in the Options 1 Byte.
Bit 0: If this key doesn't exist, add the key/value
Bit 1: If this key does exist, replace its value
Bit 2: Expiration is absolute
Bit 3: After the server has processed this request, close the
connection.
Bit 4: Connection includes a max FETCH count.
Bit 5: Increment the value by the integer found in the Value
field. If the value length is zero, increment the value
by one. If Bit 0 is set and the key does not exist, the key is
created with this value (that is, the value zero is incremented
by the increment value). This bit is mutually exclusive of
Bits 1 and 6.
Bit 6: Decrement the value by the integer found in the Value
field. If the value length is zero, decrement the value
by one. If Bit 0 is set, a key is not created. This bit
is mutually exclusive of Bits 1 and 5.
Bit 7: The key is a statistical value. This bit is mutually exclusive
of all other Options Bits.
Options 2 (required):
These bits refer to the bits in the Options 2 Byte.
Bit 0: Update the expiration of this key to the absolute time given,
but keep the same value.
Bit 1-7: Not designated
Options 3 (required):
These bits refer to the bits in the Options 3 Byte.
Bit 0-7: Not designated
Client Flags (required):
Arbitrary client flags.
Zero Filled Padding (required):
Empty. Could be used for something in the future.
Time High Bits (required):
This field contains high bits for 64 bit time. Until the year 2038,
this field will be zero. Post 2038, this field will contain the
overflow from 32 bit time values. See Time Low Bits for further
details.
Time Low Bits (required):
This specifies in seconds the expiration of the key. The Time Low
Bits Field is used to carry the expiration time when using 32 bit
time values. Expiration of keys works as follows:
If Bit 2 of the Options 1 Byte is set, this value specifies the
expiration of a key in seconds from the Epoch. If Bit 2 of the
Options 1 Byte is not set, this value specifies the relative expiration
of a key in seconds from the current time.
Key Length (required):
Specify the length of the key. The key length has a valid range
of 1 - 4294967295, the largest value a 32 bit field can hold.
Value Length (required):
Specify the length of the value. The value length may be
Max Fetch Count (optional):
Specify the maximum number of times this key can be fetched before the
server deletes the key. This value is required if Bit 4 of the Options
1 Byte is set.
Key (required):
The key for the given request. Keys are not padded by a null
character.
Value (optional):
The value for the given request. A zero length value requires that
this value be empty. Values are not padded by a null character.
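
To make the STORE layout and the Options 1 rules above concrete, here
is a minimal Python sketch of a packet builder. It rests on assumptions
that the post does not spell out: fields are big-endian, "Bit 0" is the
least significant bit, the Max Fetch Count is 32 bits wide, and a plain
"set" is expressed by setting both Bit 0 and Bit 1. The function and
constant names are hypothetical.

```python
import struct

STORE_REQUEST = ord("s")

# Options 1 bits. Assumption: "Bit 0" is the least significant bit.
OPT1_ADD = 1 << 0        # add the key/value if the key doesn't exist
OPT1_REPLACE = 1 << 1    # replace the value if the key does exist
OPT1_ABSOLUTE = 1 << 2   # expiration is absolute (seconds from the Epoch)
OPT1_CLOSE = 1 << 3      # close the connection after this request
OPT1_MAX_FETCH = 1 << 4  # packet includes a Max Fetch Count
OPT1_INCREMENT = 1 << 5
OPT1_DECREMENT = 1 << 6
OPT1_STAT = 1 << 7


def check_options1(opt: int) -> None:
    """Reject the bit combinations described as mutually exclusive."""
    if opt & OPT1_STAT and opt != OPT1_STAT:
        raise ValueError("Bit 7 excludes all other Options 1 bits")
    if opt & OPT1_INCREMENT and opt & (OPT1_REPLACE | OPT1_DECREMENT):
        raise ValueError("Bit 5 excludes Bits 1 and 6")
    if opt & OPT1_DECREMENT and opt & OPT1_REPLACE:
        raise ValueError("Bit 6 excludes Bit 1")


def build_store(key: bytes, value: bytes = b"", *,
                options1: int = OPT1_ADD | OPT1_REPLACE,
                client_flags: int = 0, expiry: int = 0,
                max_fetch=None) -> bytes:
    """Pack a STORE packet (hypothetical helper, big-endian assumed)."""
    if not key:
        raise ValueError("key length must be at least 1")
    if max_fetch is not None:
        options1 |= OPT1_MAX_FETCH
    else:
        options1 &= ~OPT1_MAX_FETCH
    check_options1(options1)
    # Split the 64 bit time into the Time High Bits and Time Low Bits fields.
    time_high, time_low = expiry >> 32, expiry & 0xFFFFFFFF
    header = struct.pack("!BBBBHHIIII", STORE_REQUEST, options1, 0, 0,
                         client_flags, 0,  # zero filled padding
                         time_high, time_low, len(key), len(value))
    parts = [header]
    if max_fetch is not None:
        parts.append(struct.pack("!I", max_fetch))
    parts += [key, value]  # neither is padded by a null character
    return b"".join(parts)
```

With these assumptions the fixed header is 24 bytes, so storing a 3 byte
key with a 3 byte value costs 30 bytes on the wire, or 34 with a Max
Fetch Count.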
The FETCH Packet:
The FETCH Packet is the only way to fetch data from the server.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Request    |   Options 1   |   Options 2   |   Options 3   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Key Length                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                              Key                              /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Request (required):
All packets other than the HELLO packet contain a Request byte.
Each value for the request byte is unique for the protocol
version. The FETCH Packet has a request value of 'f'.
Options 1 (required):
These bits refer to the bits in the Options 1 Byte.
Bit 0: If this key exists and has a relative expiration, reset
the expiration to be relative to the current time.
Bit 1: Request that the server delete the key after sending the
value to the client.
Bit 2: After the server has processed this request, close the
connection.
Bit 3: If the key exists, include the expiration of the key in
the response from the server.
Bit 4: If the key exists, include the number of fetch requests
left for this key.
Bit 5-7: Not designated
Options 2 (required):
These bits refer to the bits in the Options 2 Byte.
Bit 0-7: Not designated
Options 3 (required):
These bits refer to the bits in the Options 3 Byte.
Bit 0-7: Not designated
Key Length (required):
Specify the length of the key. The key length has a valid range
of 1 - 4294967295, the largest value a 32 bit field can hold.
Key (required):
The key for the given request. Keys are not padded by a null
character.
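
The FETCH layout above is small enough to sketch in a few lines of
Python. As before, this is only an illustration: big-endian fields and
"Bit 0" as the least significant bit are assumptions, and build_fetch
and the FETCH_OPT1_* names are hypothetical.

```python
import struct

FETCH_REQUEST = ord("f")

# Options 1 bits. Assumption: "Bit 0" is the least significant bit.
FETCH_OPT1_RESET_EXPIRY = 1 << 0  # restart a relative expiration from now
FETCH_OPT1_DELETE_AFTER = 1 << 1  # delete the key after sending the value
FETCH_OPT1_CLOSE = 1 << 2         # close the connection after this request
FETCH_OPT1_WANT_EXPIRY = 1 << 3   # include the expiration in the response
FETCH_OPT1_WANT_FETCHES = 1 << 4  # include the fetch requests left


def build_fetch(key: bytes, options1: int = 0) -> bytes:
    """Pack a FETCH packet (hypothetical helper, big-endian assumed)."""
    if not key:
        raise ValueError("key length must be at least 1")
    # Options 2 and Options 3 have no designated bits, so send zeros.
    header = struct.pack("!BBBBI", FETCH_REQUEST, options1, 0, 0, len(key))
    return header + key  # the key is not padded by a null character
```

For example, build_fetch(b"foo", FETCH_OPT1_DELETE_AFTER) asks for a
read-once fetch of "foo" in 11 bytes.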
The ERROR Packet:
The ERROR Packet is one of the ways a server responds to client
requests. Not all ERROR Packets are fatal errors and indeed, the
server responds with an ERROR Packet after a STORE Packet has been
processed by the server.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Response    | Conn. Status  | Major Status  | Minor Status  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Message Length                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                            Message                            /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Response (required):
All packets from the server contain a Response Byte. Each
value for the response byte is unique for the protocol version.
The ERROR Packet has a response value of 'e'.
Connection Status (required):
The Connection Status Field contains a response that indicates the
status of this server. The Connection Status Codes are represented
in ASCII so that they are easily human decipherable. Possible values
for the Connection Status Field are:
'C': Server has closed this connection and asked the client to not
reconnect to this server again.
'c': Server has closed this connection.
'F': Server has experienced a fatal server error and closed all
client connections, has shut down, and should be removed from
server lists.
'f': Server has experienced a fatal server error and closed all
client connections, has shut down, but will restart itself.
'g': The connection status is good, golden, groovy, great, good to go,
super green, all things nominal, or "locked, loaded, rocked, and
ready to roll." Connection remains open.
'R': Server has requested that this server be removed from the server
list. Connection has been closed by the server.
'r': Server has requested that this server be removed from the server
list, but the connection remains available.
Major Status (required):
The Major Status Field contains a status code that specifies the
type of error the server had. Status codes are represented by ASCII
characters so that responses are human decipherable. Possible
values for the Major Status Field are:
'd': Unable to decrement
'i': Unable to increment
'n': Not Found
'N': Not stored
'S': Stored
Minor Status (required):
The Minor Status Field contains a status code that refines the
type of error the server had. Status codes are represented by ASCII
characters so that responses are human decipherable. The Minor Status
Field contains values that are specific to the Major Status code. All
Minor Status Codes support the ability to return ' ' which indicates
that the client should see the Message for additional information.
Major Status "d"'s available Minor Status Codes:
'i': Invalid value
'd': Key does not exist
Major Status "i"'s available Minor Status Codes:
'i': Invalid value
'd': Key does not exist
Major Status "N"'s available Minor Status Codes:
'a': Already stored
'd': Key does not exist
Major Status "n"'s available Minor Status Codes:
'd': Key does not exist
'D': Key used to exist, but has been deleted
Major Status "S"'s available Minor Status Codes:
' ': No problems
Message Length (required):
If an additional error message is sent by the server, this field
will contain the non-zero length of that message. Otherwise it is zero.
Message (optional):
The Message Field is required if the Message Length Field is
non-zero. This message contains any additional error response
that can't be derived from the Status Fields. The Message Field is
not padded by a null character.
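
On the client side, decoding an ERROR Packet might look like the Python
sketch below. It assumes big-endian fields and a 32 bit Message Length,
as drawn in the diagram; parse_error, ErrorPacket, and the two helper
functions are hypothetical names, not part of any existing library.

```python
import struct
from typing import NamedTuple, Tuple


class ErrorPacket(NamedTuple):
    conn_status: str  # e.g. 'g', 'c', 'C', 'F', 'f', 'R', 'r'
    major: str        # e.g. 'S', 'N', 'n', 'i', 'd'
    minor: str        # ' ' means "see the Message"
    message: bytes


def parse_error(buf: bytes) -> Tuple[ErrorPacket, int]:
    """Decode one ERROR Packet; return it and the bytes consumed."""
    if len(buf) < 8:
        raise ValueError("ERROR Packet header is 8 bytes")
    response, conn, major, minor, msg_len = struct.unpack_from("!ccccI", buf, 0)
    if response != b"e":
        raise ValueError("not an ERROR Packet")
    end = 8 + msg_len
    if len(buf) < end:
        raise ValueError("truncated Message field")
    packet = ErrorPacket(conn.decode("ascii"), major.decode("ascii"),
                         minor.decode("ascii"), buf[8:end])
    return packet, end


def connection_still_open(packet: ErrorPacket) -> bool:
    """Only 'g' and 'r' are described as leaving the connection open."""
    return packet.conn_status in ("g", "r")


def was_stored(packet: ErrorPacket) -> bool:
    """Major Status 'S' is the non-error reply to a STORE Packet."""
    return packet.major == "S"
```

Returning the number of bytes consumed lets a caller keep parsing any
further packets that arrived in the same read.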
The RESPONSE packet:
A RESPONSE Packet contains some form of a value response from a FETCH
request.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Response    | Conn. Status  |         Client Flags          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Options 1   |   Options 2   |   Options 3   |   Options 4   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Value Length                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                        Time High Bits                         /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                         Time Low Bits                         /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                        Max Fetch Count                        /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                             Value                             /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Response (required):
All packets from the server contain a Response Byte. Each
value for the response byte is unique for the protocol version.
The RESPONSE packet has a response value of 'r'.
Connection Status (required):
See the Connection Status Field description in the ERROR Packet
description.
Client Flags (required):
Any client flags set by the client when the key was stored.
Options 1 (required):
Bit 0: Absolute expiration included in response.
Bit 1: Relative expiration included in response.
Bit 2: Max fetch count before request included in response. If
zero, there is no max fetch count set for this key. A
fetch count of 1 means this key expires immediately after
it has been sent to the client.
Bit 3-7: Not designated
Options 2 (required):
Bit 0-7: Not designated
Options 3 (required):
Bit 0-7: Not designated
Options 4 (required):
Bit 0-7: Not designated
Value Length (required):
Specify the length of the value. The value length has a valid range
of 0 - 4294967295, the largest value a 32 bit field can hold.
Value (optional):
If the Value Length Field is greater than zero, this field is required.
A zero length value is valid.
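
Because the RESPONSE Packet has optional fields, a decoder has to walk
it according to the Options 1 bits. The Python sketch below shows one
way to do that. It assumes big-endian fields, "Bit 0" as the least
significant bit, that both Time fields are present whenever either
expiration bit is set, and a 32 bit Max Fetch Count; none of this is
stated in the post, and parse_response and ResponsePacket are
hypothetical names.

```python
import struct
from typing import NamedTuple, Optional, Tuple

# Options 1 bits. Assumption: "Bit 0" is the least significant bit.
RESP_ABSOLUTE_EXPIRY = 1 << 0
RESP_RELATIVE_EXPIRY = 1 << 1
RESP_MAX_FETCH = 1 << 2


class ResponsePacket(NamedTuple):
    conn_status: str
    client_flags: int
    expiry: Optional[int]       # None when no expiration was included
    expiry_is_absolute: bool
    max_fetch: Optional[int]    # None when no fetch count was included
    value: bytes


def parse_response(buf: bytes) -> Tuple[ResponsePacket, int]:
    """Decode one RESPONSE Packet; return it and the bytes consumed."""
    if len(buf) < 12:
        raise ValueError("RESPONSE Packet header is 12 bytes")
    (response, conn, flags, opt1,
     _opt2, _opt3, _opt4, value_len) = struct.unpack_from("!ccHBBBBI", buf, 0)
    if response != b"r":
        raise ValueError("not a RESPONSE Packet")
    offset = 12
    expiry = None
    if opt1 & (RESP_ABSOLUTE_EXPIRY | RESP_RELATIVE_EXPIRY):
        high, low = struct.unpack_from("!II", buf, offset)
        expiry = (high << 32) | low  # rejoin Time High Bits and Time Low Bits
        offset += 8
    max_fetch = None
    if opt1 & RESP_MAX_FETCH:
        (max_fetch,) = struct.unpack_from("!I", buf, offset)
        offset += 4
    end = offset + value_len
    if len(buf) < end:
        raise ValueError("truncated Value field")
    packet = ResponsePacket(conn.decode("ascii"), flags, expiry,
                            bool(opt1 & RESP_ABSOLUTE_EXPIRY),
                            max_fetch, buf[offset:end])
    return packet, end
```

A zero length value decodes to an empty byte string rather than None,
which keeps "stored but empty" distinct from "not found" (the latter
arrives as an ERROR Packet).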
Additional Notes:
If a client connects and sends an invalid request that is out of bounds
for the protocol, the server responds with a plain text error message and
closes the connection. The format for the plain text error response is:
ERROR [code]: [message]\n
[custom message]\n
<server closes connection>
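
A client could recognize this fallback by its leading "ERROR " text and
split it into its parts, roughly as in the Python sketch below. The
post does not say what a code looks like, so the code is kept as an
opaque string; parse_text_error and the sample values in the usage note
are hypothetical.

```python
from typing import Tuple


def parse_text_error(data: bytes) -> Tuple[str, str, str]:
    """Split 'ERROR [code]: [message]\\n[custom message]\\n' into parts."""
    text = data.decode("ascii", errors="replace")
    first_line, _, rest = text.partition("\n")
    if not first_line.startswith("ERROR "):
        raise ValueError("not a plain text error response")
    code, sep, message = first_line[len("ERROR "):].partition(": ")
    if not sep:
        raise ValueError("missing ': ' between code and message")
    # The second line carries the custom message, if any.
    custom_message = rest.partition("\n")[0]
    return code, message, custom_message
```

For instance, a made-up reply of b"ERROR 42: bad request\nkey length
out of bounds\n" would split into the code "42", the message "bad
request", and the custom message "key length out of bounds".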
Stats: The only thing I haven't spec'ed out is a way of pulling all
stats in one go. I guess it could just be a special stats key.
I plan on rewriting this in nroff(7)/mdoc(7) so that the formatting is
consistent but haven't had the time yet. I'm tempted to rename the
RESPONSE Packet to the DATA Packet and rename the ERROR Packet to the
RESPONSE Packet, but haven't yet... I probably will though. There are
a few other message types that need to be added, such as a stats
message (server list update?). Thanks in advance. Comments/discussion
welcome. I'll assume a non-response as approving, though private
emails w/ some form of approval appreciated. -sc
--
Sean Chittenden