'noreply' over the wire

Sun Nov 11 16:36:46 UTC 2007

On Nov 11, 2007, at 1:22 AM, Tomash Brechko wrote:
> What do you mean by streaming?  How this can be done from
> Cache::Memcached?  Have you actually read my mail where I talk why
> errors may be ignored for some applications?

Can you do it from that particular client? Maybe, maybe not; never  
used it. Dustin's Java client can stream, though, for example. And an  
implementation of streaming is going to look almost exactly like an  
implementation of no-reply on the client side, except that it keeps a  
list of what requests it has sent and knows to eat the responses  
(possibly calling a callback function, etc.) next time it's invoked. I  
bet Cache::Memcached can be modified to do that.
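
A minimal sketch of the client-side bookkeeping described above, not the code of any real client. The names `StreamingClient`, `FakeTransport`, `send_async`, `drain` and `request` are hypothetical, and it assumes every command gets exactly one single-line reply:

```python
from collections import deque


class FakeTransport:
    """Hypothetical stand-in for a socket: records sends, replays canned replies."""

    def __init__(self, replies):
        self.sent = []
        self.replies = deque(replies)

    def send(self, data):
        self.sent.append(data)

    def recv_line(self):
        # Returns None when no reply is available yet.
        return self.replies.popleft() if self.replies else None


class StreamingClient:
    """Toy streaming client: sends without waiting, eats the replies later."""

    def __init__(self, transport):
        self.transport = transport
        self.pending = deque()  # (command, callback) pairs in send order

    def send_async(self, command, callback=None):
        # Send now, remember that a reply is still owed for this command.
        self.transport.send(command.encode() + b"\r\n")
        self.pending.append((command, callback))

    def drain(self):
        # Consume replies for outstanding requests, oldest first.
        while self.pending:
            line = self.transport.recv_line()
            if line is None:
                break
            command, callback = self.pending.popleft()
            if callback is not None:
                callback(command, line.decode().rstrip("\r\n"))

    def request(self, command):
        # Eat replies to earlier streamed requests before a blocking request.
        self.drain()
        self.transport.send(command.encode() + b"\r\n")
        line = self.transport.recv_line()
        return None if line is None else line.decode().rstrip("\r\n")
```

The only difference from a fire-and-forget client is the `pending` queue: the next call that needs a reply drains it first, so each reply is paired with the request that caused it.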

In fact, you're pretty much going to have to do something like that  
regardless, and even then it still might not work. Say I send five
no-response commands in a row to the server. Then I do a "get" and I get
back a "SERVER_ERROR out of memory". Did the "get" fail? Maybe, or  
maybe it was one of the earlier no-response commands that failed and  
the response to the "get" hasn't arrived yet. How does the client tell  
the difference?
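
To make the ambiguity concrete, here is a small simulation of that scenario. It is only a sketch: the command strings, the `match_replies` helper and the reply stream are made up for illustration, and it assumes the server emits an error line for a failed no-response command, as in the hypothetical above.

```python
def match_replies(sent, replies):
    """Pair replies with commands positionally, skipping no-reply commands.

    `sent` is a list of (command, noreply) tuples in send order; `replies`
    is the list of reply lines read from the wire so far.
    """
    expecting = [cmd for cmd, noreply in sent if not noreply]
    pairs = list(zip(expecting, replies))
    leftover = replies[len(pairs):]
    return pairs, leftover


# Five no-response commands followed by one ordinary "get".
sent = [("set k%d noreply" % i, True) for i in range(5)] + [("get k0", False)]

# Assumed wire contents: one of the earlier commands failed, and the real
# answer to the "get" arrives after the error line.
replies = ["SERVER_ERROR out of memory", "END"]

pairs, leftover = match_replies(sent, replies)
```

Here the error ends up paired with the "get" and its real response is left over, unclaimed. Nothing in the byte stream tells the client which command the error belongs to.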

> What errors are you
> expecting form, say, 'delete'?

How about, "you've sent me a malformed key?" Sure, you can argue (and  
so would I) that the client shouldn't send one in the first place, but  
the protocol design cannot assume that clients will all be
well-behaved and bug-free 100% of the time.

> Your notion of real-world is limited to your own projects.  There are
> applications where updates are no less frequent then queries.

Such as? All I'm asking for is a real-world use case. I don't care if  
it's one of my own projects or not; I just want it to be a real  
project that actually exists.

With a real project in hand, we can then urge whoever owns the project  
to try out the patches and, if there is a measurable difference,  
there'll be a pretty strong case for inclusion.

>  Yes, 'noreply' won't help your project if your estimate of 99-1 is  
> correct.
> But if it may help _other_ projects, and will not affect performance
> of get-intensive projects like yours, will you allow the patches in?

I'm in favor of allowing the patches in if they *do* help other  
projects, not if they *may* help other projects. If you look at the  
past history of memcached performance and functionality patches, it  
does not look like this:

1. Person writes patch that seems like a good idea
2. Patch gets integrated into source base
3. Everyone tests whether or not it helps

Instead, it looks like:

1. Person writes patch to fix shortcoming encountered in a project
2. Patch owner runs patch locally and gathers numbers to see if it helps
3. Patch gets posted to list and tested by other people with similar  
needs
4. If it helps, it gets integrated into source base

That is, integration into the source base is not where the
experimentation starts, which, if I'm reading you correctly, seems to be
the working assumption (it doesn't hurt, so why not let it in?). It's
instead the outcome if experimentation succeeds.

Doing it any other way is a recipe for a bloated, brittle code base  
full of speculative changes that might or might not make any real  
difference.

> Current cascade of strcmp() has a room for improvement.

Yes, that's why I replied in support of your GPerf change back when  
you first posted it in isolation. I have no objection to improving the  
parser. It's the no-reply protocol change that I'm skeptical about.  
I'm not conflating the two.

But even in the case of parser improvements, I'm slightly skeptical.  
For a "get"-heavy application -- which I believe is the most common  
use case for memcached based on discussions on this list, even though  
I agree it's not the only *possible* use case -- the current parser  
has to do exactly one strcmp() call to detect the vast majority of  
requests. For an "add", it has to do three, and for a "set" four.  
Arguably those last two should be flipped; I imagine "set" is more  
common than "add". And "delete" takes 8 strcmp() calls.

I'd want to see that GPerf is faster than those 8 strcmp() calls  
(assuming here that get/set/delete are the most common commands in all  
applications) and, more importantly, no slower than the one strcmp()  
that is currently getting executed for "get" requests.
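
A rough way to picture those comparison counts, as a sketch only. The command order in `CASCADE` is a hypothetical one chosen to reproduce the counts quoted above (1 for "get", 3 for "add", 4 for "set", 8 for "delete"); it is not copied from the memcached source, and the 99-to-1 get/set mix is just an example workload.

```python
# Hypothetical comparison order; only the resulting counts match the text.
CASCADE = ["get", "bget", "add", "set", "replace", "incr", "decr", "delete"]


def cascade_lookup(cmd):
    """Return (matched_name, comparisons) for a linear chain of comparisons."""
    comparisons = 0
    for name in CASCADE:
        comparisons += 1
        if cmd == name:
            return name, comparisons
    return None, comparisons


def average_comparisons(mix):
    """Average comparisons per request for a {command: weight} workload mix."""
    total = sum(mix.values())
    cost = sum(weight * cascade_lookup(cmd)[1] for cmd, weight in mix.items())
    return cost / total


get_heavy = average_comparisons({"get": 99, "set": 1})
delete_heavy = average_comparisons({"get": 50, "delete": 50})
```

For the get-heavy mix the average stays close to one comparison per request, which is the bar any replacement parser has to clear. Only a measurement on real hardware can say whether a generated hash function beats that.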

Adding the GPerf code will by necessity add a brand-new external  
dependency to memcached, and (this is more subjective) make the code  
harder to follow. If it doesn't hurt the typical use case and helps  
some other real use cases, then I'm all for it, but it has to actually  
produce a benefit and not be harmful in common cases to make up for  
those two costs.

> Well, all I actually heard thus far were airily arguments in the
> spirit "I don't need this so why anyone would need this?".  I don't
> actually think I have to prove anything to anyone.  And I can't add
> anything to what I already said.

No, not quite. What you're hearing isn't, "I don't need this so why  
would anyone need this?" It is instead, "What makes you need this?"  
or, if you prefer, "Can you demonstrate an actual case where this  
helps?"

> For a development list, posting patches is alright, so I didn't do any
> harm :).  If someone would find them useful, that would be great.  If
> they won't go to the mainline, so be it, perhaps they are not that
> useful.

Despite what you might think from my skepticism above, I totally agree  
with the idea of posting patches to a development list! So I do have  
to thank you for that; you've given us something concrete to look over.

memcached is a very stable program right now and, given that it's  
already quite high-performance (for busy sites, its CPU consumption is  
dominated by kernel interrupt-handling time) we want to be sure  
there's a good reason to introduce substantial changes to the core of  
the program. If there is, then great!

-Steve