'noreply' over the wire
sgrimm at facebook.com
Sun Nov 11 16:36:46 UTC 2007
On Nov 11, 2007, at 1:22 AM, Tomash Brechko wrote:
> What do you mean by streaming? How this can be done from
> Cache::Memcached? Have you actually read my mail where I talk why
> errors may be ignored for some applications?
Can you do it from that particular client? Maybe, maybe not; never
used it. Dustin's Java client can stream, though, for example. And an
implementation of streaming is going to look almost exactly like an
implementation of no-reply on the client side, except that it keeps a
list of what requests it has sent and knows to eat the responses
(possibly calling a callback function, etc.) next time it's invoked. I
bet Cache::Memcached can be modified to do that.
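As a rough illustration of the bookkeeping described above, here is a minimal sketch in C of a client that remembers what it has sent and attributes each reply to the oldest outstanding request. Every name in it (struct client, client_send, client_eat_reply, MAX_PENDING) is hypothetical and is not taken from Cache::Memcached or any other client. It also assumes, for simplicity, that each request produces exactly one reply line, which is not true of a real "get".

```c
#include <string.h>

#define MAX_PENDING 16  /* hypothetical limit on outstanding requests */

/* Hypothetical client state: a FIFO ring of requests awaiting replies. */
struct client {
    const char *queue[MAX_PENDING]; /* commands sent, oldest at 'head' */
    int head;                       /* index of the oldest outstanding request */
    int count;                      /* number of outstanding requests */
};

/* Record a request as sent without waiting for its reply.
 * Returns 0 on success, -1 if the caller must drain replies first. */
static int client_send(struct client *c, const char *command)
{
    if (c->count == MAX_PENDING)
        return -1;
    c->queue[(c->head + c->count) % MAX_PENDING] = command;
    c->count++;
    return 0;
}

/* Consume one reply line and attribute it to the oldest outstanding
 * request. Sets *is_error to 1 for the protocol's error replies.
 * Returns the command the reply belongs to, or NULL if nothing is pending. */
static const char *client_eat_reply(struct client *c, const char *reply,
                                    int *is_error)
{
    const char *command;

    if (c->count == 0)
        return NULL;
    command = c->queue[c->head];
    c->head = (c->head + 1) % MAX_PENDING;
    c->count--;
    *is_error = strcmp(reply, "ERROR") == 0 ||
                strncmp(reply, "CLIENT_ERROR", 12) == 0 ||
                strncmp(reply, "SERVER_ERROR", 12) == 0;
    return command;
}
```

Because the server answers in the order it received the requests, the queue is all the client needs to know which request a given reply belongs to. A real client would call a callback or discard the reply at the point where this sketch returns the command.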
In fact, you're pretty much going to have to do something like that
regardless, and even then it still might not work. Say I send five no-
response commands in a row to the server. Then I do a "get" and I get
back a "SERVER_ERROR out of memory". Did the "get" fail? Maybe, or
maybe it was one of the earlier no-response commands that failed and
the response to the "get" hasn't arrived yet. How does the client tell
> What errors are you
> expecting from, say, 'delete'?
How about, "you've sent me a malformed key?" Sure, you can argue (and
so would I) that the client shouldn't send one in the first place, but
the protocol design cannot assume that clients will all be well-
behaved and bug-free 100% of the time.
> Your notion of real-world is limited to your own projects. There are
> applications where updates are no less frequent than queries.
Such as? All I'm asking for is a real-world use case. I don't care if
it's one of my own projects or not; I just want it to be a real
project that actually exists.
With a real project in hand, we can then urge whoever owns the project
to try out the patches and, if there is a measurable difference,
there'll be a pretty strong case for inclusion.
> Yes, 'noreply' won't help your project if your estimate of 99-1 is
> But if it may help _other_ projects, and will not affect performance
> of get-intensive projects like yours, will you allow the patches in?
I'm in favor of allowing the patches in if they *do* help other
projects, not if they *may* help other projects. If you look at the
past history of memcached performance and functionality patches, it
does not look like this:
1. Person writes patch that seems like a good idea
2. Patch gets integrated into source base
3. Everyone tests whether or not it helps
Instead, it looks like:
1. Person writes patch to fix shortcoming encountered in a project
2. Patch owner runs patch locally and gathers numbers to see if it helps
3. Patch gets posted to list and tested by other people with similar
4. If it helps, it gets integrated into source base
That is, integration into the source base is not where the
experimentation starts, which, if I'm reading you correctly, seems to
be your working assumption ("it doesn't hurt, so why not let it in?").
It's instead the outcome if experimentation succeeds.
Doing it any other way is a recipe for a bloated, brittle code base
full of speculative changes that might or might not make any real
> Current cascade of strcmp() has a room for improvement.
Yes, that's why I replied in support of your GPerf change back when
you first posted it in isolation. I have no objection to improving the
parser. It's the no-reply protocol change that I'm skeptical about.
I'm not conflating the two.
But even in the case of parser improvements, I'm slightly skeptical.
For a "get"-heavy application -- which I believe is the most common
use case for memcached based on discussions on this list, even though
I agree it's not the only *possible* use case -- the current parser
has to do exactly one strcmp() call to detect the vast majority of
requests. For an "add", it has to do three, and for a "set" four.
Arguably those last two should be flipped; I imagine "set" is more
common than "add". And "delete" takes 8 strcmp() calls.
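To make the comparison counts concrete, here is a small sketch in C of a sequential strcmp() cascade that reports how many comparisons each command costs. The positions of "get", "add", "set" and "delete" follow the counts given above; the other entries ("bget", "replace", "incr", "decr") and all identifiers are assumptions added only to fill out the order, not a copy of the memcached parser.

```c
#include <string.h>

enum cmd {
    CMD_GET, CMD_BGET, CMD_ADD, CMD_SET,
    CMD_REPLACE, CMD_INCR, CMD_DECR, CMD_DELETE,
    CMD_UNKNOWN
};

/* Illustrative order only: chosen so that get/add/set/delete cost
 * 1/3/4/8 comparisons, matching the numbers in the text. The enum
 * values above are listed in the same order as this array. */
static const char *const cascade[] = {
    "get", "bget", "add", "set", "replace", "incr", "decr", "delete"
};

/* Walk the cascade in order, counting strcmp() calls until a match.
 * An unrecognized word pays for every comparison in the list. */
static enum cmd parse_cascade(const char *word, int *comparisons)
{
    size_t i;

    *comparisons = 0;
    for (i = 0; i < sizeof cascade / sizeof cascade[0]; i++) {
        (*comparisons)++;
        if (strcmp(word, cascade[i]) == 0)
            return (enum cmd)i;
    }
    return CMD_UNKNOWN;
}
```

In this layout, swapping the "add" and "set" entries (and their enum values) would make "set" cost three comparisons and "add" four, which is the flip suggested above.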
I'd want to see that GPerf is faster than those 8 strcmp() calls
(assuming here that get/set/delete are the most common commands in all
applications) and, more importantly, no slower than the one strcmp()
that is currently getting executed for "get" requests.
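For contrast, here is a hand-written sketch in C of the general shape of a perfect-hash lookup: compute a slot from the word's length and a selected character, then confirm with a single strcmp(). This is not GPerf output and not the posted patch. The hash function, table size, command set and all identifiers are assumptions picked by hand so that these seven words land in distinct slots, and the slot numbers assume ASCII.

```c
#include <string.h>

enum cmd {
    CMD_UNKNOWN = 0, CMD_GET, CMD_SET, CMD_ADD,
    CMD_DELETE, CMD_REPLACE, CMD_INCR, CMD_DECR
};

struct entry {
    const char *name; /* NULL marks an empty slot */
    enum cmd id;
};

#define TABLE_SIZE 32

/* Hypothetical hash: first byte plus twice the length, modulo the
 * table size. Collision-free for the seven words below (ASCII). */
static unsigned slot_for(const char *word, size_t len)
{
    return ((unsigned char)word[0] + 2u * (unsigned)len) % TABLE_SIZE;
}

/* Slots worked out by hand from slot_for(); unlisted slots are empty. */
static const struct entry table[TABLE_SIZE] = {
    [0]  = { "replace", CMD_REPLACE },
    [7]  = { "add",     CMD_ADD },
    [12] = { "decr",    CMD_DECR },
    [13] = { "get",     CMD_GET },
    [16] = { "delete",  CMD_DELETE },
    [17] = { "incr",    CMD_INCR },
    [25] = { "set",     CMD_SET },
};

/* One length check, one hash, at most one strcmp() per lookup. */
static enum cmd parse_hashed(const char *word)
{
    size_t len = strlen(word);
    const struct entry *e;

    if (len < 3 || len > 7)
        return CMD_UNKNOWN;
    e = &table[slot_for(word, len)];
    if (e->name != NULL && strcmp(word, e->name) == 0)
        return e->id;
    return CMD_UNKNOWN;
}
```

Every command costs the same here: a strlen(), a little arithmetic and one strcmp(). That is cheaper than an eight-deep cascade for "delete", but for "get" it is extra work on top of the single strcmp() the cascade already needs, which is why measuring both paths matters.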
Adding the GPerf code will by necessity add a brand-new external
dependency to memcached, and (this is more subjective) make the code
harder to follow. If it doesn't hurt the typical use case and helps
some other real use cases, then I'm all for it, but it has to actually
produce a benefit and not be harmful in common cases to make up for
those two costs.
> Well, all I actually heard thus far were airy arguments in the
> spirit "I don't need this so why anyone would need this?". I don't
> actually think I have to prove anything to anyone. And I can't add
> anything to what I already said.
No, not quite. What you're hearing isn't, "I don't need this so why
would anyone need this?" It is instead, "What makes you need this?"
or, if you prefer, "Can you demonstrate an actual case where this
> For a development list, posting patches is alright, so I didn't do any
> harm :). If someone would find them useful, that would be great. If
> they won't go to the mainline, so be it, perhaps they are not that
Despite what you might think from my skepticism above, I totally agree
with the idea of posting patches to a development list! So I do have
to thank you for that; you've given us something concrete to look over.
memcached is a very stable program right now and, given that it's
already quite high-performance (for busy sites, its CPU consumption is
dominated by kernel interrupt-handling time) we want to be sure
there's a good reason to introduce substantial changes to the core of
the program. If there is, then great!