binary wire protocol in memcached

Tony Tung ttung at facebook.com
Wed Sep 5 00:14:45 UTC 2007


Comments inline:

On 9/4/07 1:10 PM, "Dustin Sallings" <dustin at spy.net> wrote:

> 
> On Sep 4, 2007, at 12:04 , Tony Tung wrote:
> 
>> 1) The magic byte is different, and also unique to request and response.  We
>> also maintain a notion of binary protocol versions for possible future
>> (albeit limited) expansion.
> 
> It was decided that the request and response magic should be different.  I
> believe I added the definitions but didn't actually give them different
> values.
> 
I have an email from you on 8/23 in which you said, ³MAGIC BYTE 0xf  (for
both requests and responses)².
> 
>>  2) We specified the command IDs a bit differently, grouping them by
>> request/response type.
> 
> What is the value of this?  Is it not more valuable to just list them
> sequentially so implementors can easily tell what the commands are without
> computing them?
> 
One possible use for this is that packet sniffers can decode messages even
for unfamiliar commands.
> 
>>  #define BP_QUIET           BIT(3)
> 
> I suggested this in the first meeting and it was rejected.  The idea was to
> make commands (including quiet commands) with their specific semantics as
> needed.  For example, the desirable semantics of a quiet get are that there is
> no response body unless there's a success.  For a quiet set (if such a thing
> needs to exist), there should be no response body unless there's an error.
> 
BP_QUIET is not a command.  It is a flag that is or¹ed into the command to
get the command ID of the quiet version of the same command.
> 
> 
>>      // these commands go as an empty_req and return as an empty_rep.
>>      BP_ECHO_CMD        = (BP_E_E | FIELD(0x0, 0)),
> 
> We had discussed both an echo (BP_S_S) and a noop (BP_E_E).  I didn't
> implement echo, but my noop has your echo semantics, it seems.
> 
Correct.
> 
>>      BP_SERVERERR_CMD   = (BP_E_S | FIELD(0x1, 0)), // this is actually not a
>>                                                     // command.  this is
>> solely
>>                                                     // used as a response
>> when
>>                                                     // the server wants to
>>                                                     // indicate an error
>> status.
> 
> When does the server indicate an error status that's not in response to a
> command?
> 
One possible instance is when the server cannot parse the requests.  It has
no specific request to respond to in that situation.
> 
>>     // these commands go as a string_req and return as an empty_rep.
>>      BP_FLUSH_REGEX_CMD = (BP_S_E | FIELD(0x0, 0)),
> 
> Does this implementation exist?  People have been asking for it a lot.
> 
>>      // these commands go as a string_req and return as a string_rep.
>>      BP_STATS_CMD       = (BP_S_S | FIELD(0x0, 0)),
> 
> We discussed having more structure in the response to a stats command since
> there is structure in the stats result itself.  It makes sense to me to have
> the response be a list of sorts.
> 
That is something I¹ve considered as well, but for a first pass, I didn¹t
put too much thought into it.
> 
>>  3) We have no support for 64-bit arithmetic commands.  We have retained decr
>> (so far).
> 
> Decr was killed because it seemed mostly unnecessary and was semantically
> different (in implementation, anyway) from incr.  I fixed the latter bug in my
> tree and made sure the incr amount was unsigned with the 64-bit patch.
> 
> Obviously, decr still exists in spirit, as my client interface hasn't changed
> due to binary protocol implementation.
> 
>>  4) The return value for arithmetic commands is binary and not a string.  I
>> wasn¹t at the hackathon during which this was decided so I¹d be interested in
>> finding out why a string is preferable.
> 
> The string is required for the internal representation due to ``get'' requests
> expecting ascii representations of the numbers.  As the arithmetic operations
> are already working on string values, it made sense to go ahead and return a
> value from incr the same way that the same value would be returned from a get.
> 
> I suppose we could pack the results into a 64-bit number as well, by modifying
> add_delta to provide the counter to the response, but I don't think there's a
> lot to gain here.  As it is, I'm using an unmodified do_add_delta (well, with
> the 64-bit mods) for both text and bin.
> 
> The other difference is that we're allowing counters to (optionally) be
> created using incr.  At least one of the users of counters indicated he's
> sending an add and incr together to ensure the counter exists.  It seems
> natural to put these things together.  Based on values in your packet, you can
> either expect a NOT_FOUND sort of response, or an auto-created counter when
> incrementing a missing value.
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.danga.com/pipermail/memcached/attachments/20070904/e6e63a7a/attachment.html


More information about the memcached mailing list