Extensible command syntax

Dustin Sallings dustin at spy.net
Thu Nov 8 09:41:07 UTC 2007


On Nov 8, 2007, at 0:23, Tomash Brechko wrote:

> But this doesn't make the scheme more flexible, because this way I
> can't use INCR to both update the value _and_ refresh the entry.
> Whatever predetermined approach you choose, you'd close the door for
> other possible uses.

	I didn't say it was more flexible.  I was just saying it's flexible  
enough.

	It would make sense to have a separate command for updating the flags  
or expiration of a record if that is really of interest.
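
	Just to sketch the shape of such a thing (purely hypothetical here,
not an existing command), a text-protocol variant could look like:

   touch <key> <exptime>\r\n            (hypothetical syntax)
   TOUCHED\r\n   -or-   NOT_FOUND\r\n

	i.e. the expiration gets refreshed without the value ever crossing
the wire.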

	The strong desire to keep the commands small and orthogonal was
expressed at the first meeting, where we were going over the binary
protocol.

>> 	No, this get mechanism works fine, and I pipeline everything heavily
>> with great success.  You can get values back in any order and you are
>> notified when all of the results are available.
>
> This depends on how you look at it.  I mean _sequential_ pipelining
> (the way pipelines actually work), while you are talking about batch
> processing.  With sequential pipelining, I push requests and fetch
> results, and since there's a direct one-to-one correspondence between
> request and response, I don't need any additional logic on the client
> side.  I.e., if I have a list of keys, I can push them to the server
> and fetch the results in order.  I don't have to keep a hash on the
> client to decide where a particular result belongs.

	I fail to see what I'm missing.  As far as I can tell, you're
describing what I already do.  See my write-up on client optimization
and let me know what I'm missing.

	http://bleu.west.spy.net/~dustin/projects/memcached/optimization.html

	Note that in my client I can issue several distinct requests and wait  
(blocking or not) for the results in any order I feel like.
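
	To make that concrete, here's a throwaway sketch of the wire side of
this kind of pipelining over the text protocol: several independent
single-key gets written back-to-back on one socket before any of the
responses are read (reading them in order here for simplicity; a real
client can hand the results out in whatever order the caller asks).
It isn't lifted from any of my clients, and it assumes a memcached
server listening on localhost:11211:

  import socket

  # Sketch only: pipeline several single-key text-protocol gets by writing
  # all of the requests before reading any of the responses.  Replies come
  # back in request order, one "END" per request.
  keys = ["alpha", "beta", "gamma"]            # hypothetical keys
  s = socket.create_connection(("127.0.0.1", 11211))
  f = s.makefile("rwb")

  for key in keys:                             # push every request first
      f.write(("get %s\r\n" % key).encode())
  f.flush()

  for key in keys:                             # then read replies in order
      value = None
      line = f.readline()
      while not line.startswith(b"END"):
          # "VALUE <key> <flags> <bytes>\r\n" followed by the data block
          _, _rkey, _flags, nbytes = line.split()
          value = f.read(int(nbytes) + 2)[:-2] # strip the trailing \r\n
          line = f.readline()
      print(key, "->", value if value is not None else "<miss>")

  s.close()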

	In the text protocol, a get with several keys only returns hits and
an end marker.  The idea is that if you're issuing that request,
you're probably going to return some kind of dictionary structure to
the caller anyway.
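
	A minimal sketch of what that looks like on the wire (again just an
illustration, assuming a server on localhost:11211, and not taken
from any particular client):

  import socket

  # Sketch only: one multi-key text-protocol get.  Only hits come back,
  # each as a VALUE block, followed by a single END marker.
  def multi_get(keys, host="127.0.0.1", port=11211):
      s = socket.create_connection((host, port))
      f = s.makefile("rwb")
      f.write(("get %s\r\n" % " ".join(keys)).encode())
      f.flush()

      found = {}                               # dictionary of hits only
      line = f.readline()
      while not line.startswith(b"END"):
          # "VALUE <key> <flags> <bytes>\r\n" followed by <bytes> of data
          _, key, _flags, nbytes = line.split()
          found[key.decode()] = f.read(int(nbytes) + 2)[:-2]
          line = f.readline()
      s.close()
      return found

  print(multi_get(["alpha", "beta", "gamma"]))  # misses simply don't appear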

	In the binary protocol, the concept of a "multi-get" was removed in
favor of a "quiet get" (no response on a miss) and a noop.  You
achieve the same effect, *or* your client can decide it does want a
NAK for every miss.  You could also replace the last request with a
non-quiet get to optimize out the noop if you wanted.
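
	Here's a rough sketch of that quiet-get-plus-noop batch.  Caveat: the
24-byte header layout and the opcode numbers (GETQ=0x09, NOOP=0x0a)
below are my reading of the protocol plan and may not match the draft
exactly, and a binary-protocol-capable server on localhost:11211 is
assumed:

  import socket
  import struct

  # Sketch only: a batch of "quiet gets" terminated by a noop.  Misses
  # produce no response at all; the noop response marks the end of the
  # batch.  The opaque field carries each key's index so hits can be
  # matched back up regardless of which keys were found.
  HEADER = struct.Struct(">BBHBBHIIQ")     # magic, opcode, key len,
                                           # extras len, data type,
                                           # vbucket/status, total body len,
                                           # opaque, cas
  REQUEST, GETQ, NOOP = 0x80, 0x09, 0x0a

  def recv_exact(sock, n):
      data = b""
      while len(data) < n:
          chunk = sock.recv(n - len(data))
          if not chunk:
              raise EOFError("connection closed")
          data += chunk
      return data

  def quiet_multi_get(keys, host="127.0.0.1", port=11211):
      s = socket.create_connection((host, port))

      batch = b""
      for i, key in enumerate(keys):
          k = key.encode()
          batch += HEADER.pack(REQUEST, GETQ, len(k), 0, 0, 0,
                               len(k), i, 0) + k
      batch += HEADER.pack(REQUEST, NOOP, 0, 0, 0, 0, 0, 0xFFFFFFFF, 0)
      s.sendall(batch)                     # one write for the whole batch

      found = {}
      while True:
          (_magic, opcode, keylen, extlen, _dtype, status,
           bodylen, opaque, _cas) = HEADER.unpack(recv_exact(s, 24))
          body = recv_exact(s, bodylen)
          if opcode == NOOP:               # end of the batch
              break
          if status == 0:                  # a hit: extras (flags), then value
              found[keys[opaque]] = body[extlen + keylen:]
      s.close()
      return found

  print(quiet_multi_get(["alpha", "beta", "gamma"]))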

	In *both* cases, I don't see how I could pipeline any more than I am  
today.

>> A get across a couple of thousand keys is a one line response in a
>> case where none exist (or one message response in the binary
>> protocol).
>
> I'd rather optimize for the "found" case.  Suppose your request has a
> large number of keys, and only last one matches.  The client has to
> wait till the very end (batch mode), while with pipelining it could
> start the processing of not found entries right away.

	Ah, well in the general case there's no processing to do for keys
that aren't found.  If I've optimized several threads' requests
together in such a way that I could theoretically know when all of
the requests for one of them have been satisfied, then I could send
that one on its way sooner.

	I would suspect, however, that the time difference is negligible.  A  
multi-get is generally considered faster than a series of individual  
gets in the text protocol, and they're just barely different in the  
binary protocol (to the point where I could simply change what my  
multi-get implementation does to measure the difference).

> Another advantage of a flexible text protocol is that once it's there,
> you don't have to update all the text clients (Perl, PHP, etc.) when
> you add a new parameter to some command, given that they have the
> means to send an arbitrary text request.  I.e., it will always be
>
>  $memcached->set($key, $val, @params);
>
> not
>
>  $memcached->new_cas_command(...);

	I'm not sure that's a huge advantage.  You have to know you're doing  
a CAS, and they'd both probably be implemented as:

   $memcached->send_cmd(...);

	anyway.

>> 	It's a given that the current protocol isn't perfect.  That's
>> why we made a new one.  You should complain about that one more.  :)
>
> BTW, is there a description of this binary protocol?

	There's not a very good one anywhere.  doc/binary-protocol-plan.txt
has preliminary documentation that explains the spirit of the
protocol reasonably well, but it hasn't been updated since more of
the details were agreed upon at the second meeting.  There are,
however, a couple of implementations you can read that should help
you understand how the protocol is implemented:

	The initial test client and server code I wrote after the first  
meeting (and have kept up-to-date since then) is probably the best  
reference that exists at the moment:

	http://hg.west.spy.net/hg/python/memcached-test/


	My latest memcached binary server tree is available here (tree
auto-updated whenever I push my patch stack):

	http://hg.west.spy.net/hg/hacks/memcached-binary-full/archive/tip.tar.gz


	My java client has a pretty solid binary protocol implementation:

	http://hg.west.spy.net/hg/memcached/

-- 
Dustin Sallings