Binary protocol questions

Dustin Sallings dustin at spy.net
Sun Nov 11 20:54:04 UTC 2007


On Nov 11, 2007, at 3:30, Tomash Brechko wrote:

> AFAIU, this will change, and eventually binary protocol will be
> accepted to the mainline.  Little is known about it, but from the
> binary-protocol-plan.txt (and it was said that it's a bit outdated) we
> may see:

	It's not so much outdated as it is missing some detail.

> But what if the 'version' becomes 0x61 ('a')?  How will the server
> distinguish between the text and binary protocols then?  Can't you
> prepend another byte that will solve this (zero, for instance)?  Some
> may think, "but why would anyone use the text protocol when I use
> binary?", but the answer is in this doc file already:

	That's only an issue if you try to run it on the same port.  I, at  
least, don't do that.

	This does come up occasionally, but I don't understand what's so
desirable about it.  A second port costs one more file descriptor, and
there's no way for a client to discover whether the server speaks the
new protocol anyway.
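
For illustration, here is a minimal Python sketch of the single-port
sniffing being asked about.  The magic value 0x80 and the function name
are assumptions made for this example, not taken from
binary-protocol-plan.txt; the only thing relied on is that text
commands (get, set, add, ...) begin with a lowercase ASCII letter.

```python
# Hypothetical sketch: guess the protocol from the first byte on a shared port.
# ASSUMPTION: the binary protocol's leading byte is a value that never starts
# a text command.  0x80 is used here for illustration only.
BINARY_MAGIC = 0x80


def sniff_protocol(first_byte: int) -> str:
    """Classify a connection by the first byte the client sends."""
    if first_byte == BINARY_MAGIC:
        return "binary"
    # Text commands ("get", "set", "add", ...) start with a lowercase letter.
    if ord("a") <= first_byte <= ord("z"):
        return "text"
    return "unknown"


print(sniff_protocol(b"get foo\r\n"[0]))  # text
print(sniff_protocol(0x80))               # binary
print(sniff_protocol(0x61))               # text: a leading 0x61 reads as 'a'
```

The last call shows the collision raised in the question: a binary
leading byte of 0x61 would be indistinguishable from a text command
starting with 'a'.  Separate ports avoid the need to sniff at all.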

> But memcached is one of zillions of projects; it will take a long
> time until all of its protocol versions are incorporated into all
> packet analyzers (if this happens at all).  So having a working text
> protocol might be useful.

	I don't really see someone saying, ``Hey, our packet captures show  
this weird protocol, can we reconfigure all of our clients to speak  
the text protocol so we can understand them?''  When people need it,  
they'll write it.  Some people may even write it without needing it in  
the first place.

> As you understand, any protocol has syntax and semantics, and text vs
> binary is a _purely syntactic_ issue.  More than that, if both evolve
> separately, then a text protocol with thought-out semantics may
> outperform a binary one with poor semantics (parsing of text is
> slower, but the number of commands, and thus key lookups, needed to do
> useful things might be lower).  Shouldn't the semantics be defined
> first, and only then its encoding in text or binary form?

	We were trying to fix what we perceived as inadequate.  The semantics  
have mostly been fine for (as far as we know) everyone involved so  
far.  The areas where different things are needed were getting  
addressed.

	One of those areas was overhead in parsing.  Along the way, we were  
able to do neat stuff like tag responses with request opaques so you  
don't have to pass keys back in get requests.
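
A rough Python sketch of the opaque idea follows.  The header layout
here (opcode, key length, opaque) and the opcode value are made up for
the example and are not the real wire format; the point is only that
the response echoes the opaque, so the client can recover the key
without the server sending it back.

```python
import struct

# ASSUMPTION: a simplified, made-up header of opcode (1 byte), key length
# (2 bytes) and opaque (4 bytes), big-endian.  Not the real wire layout.
HEADER = struct.Struct("!BHI")
OP_GET = 0x00  # hypothetical opcode value


def build_get(key: bytes, opaque: int) -> bytes:
    """Encode a get request carrying the key and a client-chosen opaque."""
    return HEADER.pack(OP_GET, len(key), opaque) + key


def build_response(opaque: int, value: bytes) -> bytes:
    """Encode a response that echoes the opaque but omits the key."""
    return HEADER.pack(OP_GET, 0, opaque) + value


def match_response(pending: dict, packet: bytes):
    """Recover the key for a response by looking up its opaque."""
    _, _, opaque = HEADER.unpack_from(packet)
    value = packet[HEADER.size:]
    return pending.pop(opaque), value


pending = {}
requests = []
for opaque, key in enumerate([b"user:1", b"user:2"]):
    pending[opaque] = key
    requests.append(build_get(key, opaque))

# Responses may arrive in any order; the opaque says which key each answers.
print(match_response(pending, build_response(1, b"bob")))    # (b'user:2', b'bob')
print(match_response(pending, build_response(0, b"alice")))  # (b'user:1', b'alice')
```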

> Once the semantics of the whole protocol is defined, it's time to
> think about its text and binary syntax.  And because the above implies
> optional parameters, both syntaxes should support these.  Then you
> implement such syntax, and do the benchmarks.  If you find that
> parsing of varying-width commands is indeed an issue, only then you
> add more fixed-width commands for frequent use (leaving varying-width
> commands in place, of course).
>
> This is what the hackathon should have produced.

	Perhaps.  The concept was brought up a couple different times in a  
couple different ways, but Brad didn't like it.  His view is that  
different semantics should be expressed with different commands.   
There's nothing particularly terrible about that.

> So can someone calm me on my fears:
>
>  - binary protocol will _replace_ text protocol.

	It was pretty much understood that they'd both exist for a good long  
time.

>  - binary protocol will repeat some of the shortcomings of the
>    current text protocol (like mandatory exptime and flags).

	Um, I'd prefer to describe them as ``semantically compatible.''   
There are a few differences, but I implement both with the same  
interface in my client.

>  - binary protocol will be developed elsewhere, and then pushed to
>    the mainline on the basis "Works for us!".

	Sure.  It works fine, but it's not a priority to get it in yet.

> Though it may sound like the continuation of the 'noreply' fight, it
> is not.

	Well, you shouldn't have hit reply on that thread, then.  :)  (almost  
lost this message in there)

> I too want to have binary protocol, but not as a
> _replacement_ of the text protocol, and definitely not a broken one.


	I don't think it's particularly broken.  You're right that there are
things you can't do with it, but that doesn't necessarily mean there's
a problem; it may just be that people don't do them.

	For example, you can't replace without updating the expiration date.   
There are two problems with that:

	1)  replace probably shouldn't even exist.  It's a command that just  
adds symmetry to add, and it's quite possible nobody uses it.
	2)  Who refreshes a cache value without caring when it expires
afterwards?  It seems to me that refreshing a cache value is a good
time to refresh its expiration as well.
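
To make the constraint concrete: a text-protocol storage command has
the form "<command> <key> <flags> <exptime> <bytes>", so flags and
exptime are positional and cannot be left out.  The Python helper
below (its name and the sample key are invented for this example)
builds such a command and shows that every replace necessarily carries
a new expiration.

```python
def storage_command(verb: str, key: str, value: bytes,
                    flags: int, exptime: int) -> bytes:
    """Build a text-protocol storage command followed by its data block."""
    # flags and exptime are positional fields; there is no way to omit them.
    header = f"{verb} {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


# Replacing a value means choosing an exptime again (here, 300 seconds).
print(storage_command("replace", "session:42", b"hello", 0, 300))
# b'replace session:42 0 300 5\r\nhello\r\n'
```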

	I think you're confusing ``broken'' with ``I can imagine something  
you can't do.''

-- 
Dustin Sallings




