binary wire protocol in memcached
Tony Tung
ttung at facebook.com
Tue Sep 4 19:04:59 UTC 2007
Hi,
We have a working binary wire protocol implementation (client and server)
though it does not reflect the changes that occurred at the last hackathon
(at Sun). Some key differences:
1) The magic byte is different, and also unique to request and response. We
also maintain a notion of binary protocol versions for possible future
(albeit limited) expansion.
// version number and magic byte.
#define BP_VERSION 0x0
#define BP_REQ_MAGIC_BYTE (0x50 | BP_VERSION)
#define BP_REP_MAGIC_BYTE (0xA0 | BP_VERSION)
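As a rough sketch of how a receiver might use these defines: the OR above implies the high nibble marks the direction and the low nibble carries the version. The helper names below (bp_is_request, bp_is_response, bp_magic_version) are hypothetical and not taken from the implementation.

```c
#include <stdint.h>

#define BP_VERSION 0x0
#define BP_REQ_MAGIC_BYTE (0x50 | BP_VERSION)
#define BP_REP_MAGIC_BYTE (0xA0 | BP_VERSION)

/* Hypothetical helpers: the high nibble is assumed to identify the
 * direction of the packet, the low nibble the protocol version. */
static int bp_is_request(uint8_t magic)
{
    return (magic & 0xF0) == 0x50;
}

static int bp_is_response(uint8_t magic)
{
    return (magic & 0xF0) == 0xA0;
}

static unsigned bp_magic_version(uint8_t magic)
{
    return magic & 0x0F;
}
```

With this split, a version bump only changes the low nibble, so at most 16 versions fit, which matches the "albeit limited" expansion noted above.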
2) We specified the command IDs a bit differently, grouping them by
request/response type.
#define BIT(x) (1ULL << (x))
#define FIELD(val, shift) ((val) << (shift))
#define BP_E_E FIELD(0x0, 4)
#define BP_E_S FIELD(0x1, 4)
#define BP_K_V FIELD(0x2, 4)
#define BP_KV_E FIELD(0x3, 4)
#define BP_KN_E FIELD(0x4, 4)
#define BP_KN_N FIELD(0x5, 4)
#define BP_N_E FIELD(0x6, 4)
#define BP_S_E FIELD(0x7, 4)
#define BP_S_S FIELD(0x8, 4)
#define BP_QUIET BIT(3)
typedef enum bp_cmd {
// these commands go as an empty_req and return as an empty_rep.
BP_ECHO_CMD = (BP_E_E | FIELD(0x0, 0)),
BP_QUIT_CMD = (BP_E_E | FIELD(0x1, 0)),
// these commands go as an empty_req and return as a string_rep.
BP_VER_CMD = (BP_E_S | FIELD(0x0, 0)),
// this is actually not a command. this is solely used as a response when
// the server wants to indicate an error status.
BP_SERVERERR_CMD = (BP_E_S | FIELD(0x1, 0)),
// these commands go as a key_req and return as a value_rep.
BP_GET_CMD = (BP_K_V | FIELD(0x0, 0)),
BP_GETQ_CMD = (BP_K_V | BP_QUIET | FIELD(0x0, 0)),
// these commands go as a key_value_req and return as an empty_rep.
BP_SET_CMD = (BP_KV_E | FIELD(0x0, 0)),
BP_ADD_CMD = (BP_KV_E | FIELD(0x1, 0)),
BP_REPLACE_CMD = (BP_KV_E | FIELD(0x2, 0)),
BP_APPEND_CMD = (BP_KV_E | FIELD(0x3, 0)),
BP_SETQ_CMD = (BP_KV_E | BP_QUIET | FIELD(0x0, 0)),
BP_ADDQ_CMD = (BP_KV_E | BP_QUIET | FIELD(0x1, 0)),
BP_REPLACEQ_CMD = (BP_KV_E | BP_QUIET | FIELD(0x2, 0)),
BP_APPENDQ_CMD = (BP_KV_E | BP_QUIET | FIELD(0x3, 0)),
// these commands go as a key_number_req and return as an empty_rep.
BP_DELETE_CMD = (BP_KN_E | FIELD(0x0, 0)),
BP_DELETEQ_CMD = (BP_KN_E | BP_QUIET | FIELD(0x0, 0)),
// these commands go as a key_number_req and return as a number_rep.
BP_INCR_CMD = (BP_KN_N | FIELD(0x0, 0)),
BP_DECR_CMD = (BP_KN_N | FIELD(0x1, 0)),
// these commands go as a number_req and return as an empty_rep.
BP_FLUSH_ALL_CMD = (BP_N_E | FIELD(0x0, 0)),
// these commands go as a string_req and return as an empty_rep.
BP_FLUSH_REGEX_CMD = (BP_S_E | FIELD(0x0, 0)),
// these commands go as a string_req and return as a string_rep.
BP_STATS_CMD = (BP_S_S | FIELD(0x0, 0)),
} bp_cmd_t;
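To illustrate the grouping, here is a minimal sketch of splitting a command ID back into its parts. It assumes the command ID travels as a single byte, with the group in bits 4-7, BP_QUIET in bit 3, and the per-group command number in bits 0-2. The mask names, the bp_cmd_parts_t struct, and bp_cmd_split are hypothetical.

```c
#include <stdint.h>

/* Masks derived from the FIELD()/BIT() layout above (names are hypothetical). */
#define BP_GROUP_MASK 0xF0u /* FIELD(x, 4): request/response shape */
#define BP_QUIET_MASK 0x08u /* BIT(3): quiet variant */
#define BP_INDEX_MASK 0x07u /* FIELD(x, 0): command number within the group */

/* Hypothetical decoded view of a one-byte command ID. */
typedef struct bp_cmd_parts {
    uint8_t group; /* e.g. 0x2 for BP_K_V, 0x3 for BP_KV_E */
    uint8_t quiet; /* 1 if BP_QUIET is set, else 0 */
    uint8_t index; /* command number within the group */
} bp_cmd_parts_t;

static bp_cmd_parts_t bp_cmd_split(uint8_t cmd)
{
    bp_cmd_parts_t p;
    p.group = (uint8_t)((cmd & BP_GROUP_MASK) >> 4);
    p.quiet = (uint8_t)((cmd & BP_QUIET_MASK) != 0);
    p.index = (uint8_t)(cmd & BP_INDEX_MASK);
    return p;
}
```

For example, BP_GETQ_CMD works out to 0x28, which splits into group 0x2 (key_req in, value_rep out), quiet set, command 0. A parser can then pick the request and response layouts from the group alone, without a per-command table.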
3) We have no support for 64-bit arithmetic commands. We have retained decr
(so far).
4) The return value for arithmetic commands is binary and not a string. I
wasn't at the hackathon during which this was decided so I'd be interested
in finding out why a string is preferable.
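For comparison, a small sketch of the two encodings of an arithmetic result. The 32-bit width and big-endian (network) byte order are assumptions for illustration, and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Binary form: always 4 bytes, big-endian (assumed width and order). */
static size_t put_counter_binary(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

static uint32_t get_counter_binary(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* String form: decimal digits, variable length.
 * Returns the number of digits written, or 0 if the buffer is too small. */
static size_t put_counter_string(char *out, size_t cap, uint32_t v)
{
    int n = snprintf(out, cap, "%lu", (unsigned long)v);
    return (n < 0 || (size_t)n >= cap) ? 0 : (size_t)n;
}
```

The binary form has a fixed size and needs no parsing on the client. The string form varies in length, so the reply must carry a length and the client must convert the digits back to a number.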
Thanks,
Tony
On 8/27/07 8:39 PM, "Dustin Sallings" <dustin at spy.net> wrote:
>
>
> I just got memcached passing all of my tests with the binary wire
> protocol (tcp). Sorry it took me so long to get going on it.
>
> I've implemented this as a patch stack (using mercurial queues) over
> trunk r608 with Evan Miller's 64-bit counter patch applied (since
> everyone is excited by big numbers). I've got no idea where (if
> anywhere) active development is going on. I can pick it up and move
> it if needed.
>
> I don't know what clients are ready. I haven't done a release of my
> java client with binary protocol support yet, but I can roll out a
> pre-release if someone wants to try it (all of my interop tests pass
> between my java client and memcached in binary mode, so I just need
> to clean it up a bit). My test python client is functional for at
> least interactive use as well.
>
>
> Known issues:
>
> I didn't implement a timeout on delete. I suppose I should, but do
> people actually use that?
>
> I didn't implement a timeout on flush. I'd rather avoid that one if
> possible because the semantics are really confusing.
>
> I basically ignored managed buckets because I have no idea how they
> work.
>
> My 64-bit ntohl kind of thing might not work on big endian machines
> (I haven't tried it yet). Moreover, I would hope something like that
> would already exist somewhere, so I didn't bother making it fast.
>
> I've only implemented version, flush, noop, set, add, replace, get,
> getq, delete, and incr. Besides stats (for which we have no clearly
> defined packet format), there seem to be other commands in there that
> don't make much sense to me.
>
> There may be more, but I don't know about them yet.
>
> --
> Dustin Sallings
>
>
>
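On the 64-bit ntohl concern in the quoted message: one way to sidestep host byte order is to assemble the value from individual bytes with shifts, which behaves the same on big-endian and little-endian machines. This is a sketch, not the code from the patch stack; the function names are hypothetical.

```c
#include <stdint.h>

/* Read 8 bytes in network (big-endian) order into a host uint64_t.
 * Only shifts and ORs are used, so the host byte order does not matter. */
static uint64_t bp_read_be64(const uint8_t in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

/* Write a host uint64_t as 8 bytes in network (big-endian) order. */
static void bp_write_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}
```

This is not the fastest approach, but it avoids both an endianness check and unaligned 64-bit loads from the packet buffer.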