non-blocking IO in clients (and other questions)

Dustin Sallings dustin at spy.net
Tue Oct 2 17:59:43 UTC 2007


On Oct 2, 2007, at 9:36 , Brian Aker wrote:

> So I am looking at increasing the performance in libmemcached.  
> Looking at how some of the other clients are implemented I am  
> finding a catch-22 that I am hoping someone can explain.
>
> Most clients seem to be setting their IO to non-blocking, which is  
> excellent, but I don't understand what this is really buying since:
> 1) Clients are not threaded

	I don't quite understand why you're implying non-blocking IO and  
threading must go together.  Many people implement threads just  
because non-blocking IO appears to require more thought (in reality,  
it seems to be the other way around, but that's a different issue).

	My client is used in threaded environments, but it has only one
thread dedicated to IO multiplexing.  That thread performs non-blocking
IO over as many connections as it needs, sending and receiving whenever
possible and completing requests once enough data has arrived.
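
	To make that concrete, here's a minimal sketch (not my client's
actual code) of one thread multiplexing several non-blocking connections
with poll(2); readable()/writable() stand in for whatever buffer
handling the client does:

    #include <poll.h>

    /* One thread, many connections: block only when no socket has
       anything for us, then service whichever ones are ready. */
    void io_loop(struct pollfd *fds, int nfds,
                 void (*readable)(int fd), void (*writable)(int fd)) {
        for (;;) {
            if (poll(fds, nfds, -1) < 0)   /* wait only when idle */
                break;
            for (int i = 0; i < nfds; i++) {
                if (fds[i].revents & POLLIN)
                    readable(fds[i].fd);   /* drain responses, complete ops */
                if (fds[i].revents & POLLOUT)
                    writable(fds[i].fd);   /* flush queued requests */
            }
        }
    }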

> 2)  The protocol always sends an ACK of some sort.

	The interface my client exposes doesn't require the caller to
wait for ACKs.  You tend to want to wait for get requests, but you may
not care in the case of deletes or sets.

	That is to say, you generally do want to know when an operation
has finished (in the case of quiet gets in the binary protocol, you'll
want a noop or a regular get at the end), but it doesn't make sense to
send a quiet get and then sit waiting just in case something starts
arriving.  Instead, just stream requests out and stream responses in.
Line them up, and you're good to go.
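
	A sketch of that ``line them up'' idea (not any particular
client's code): every request pushes an expectation onto a FIFO, and
every response line pops one.  A get's expectation hands the value back
to the caller; a set's or delete's can simply throw the ACK away if
nobody cares:

    #define MAXPENDING 256

    typedef void (*expect_fn)(const char *response_line);

    static expect_fn pending[MAXPENDING];  /* FIFO of outstanding requests */
    static unsigned ph, pt;

    /* Call right after streaming a request out. */
    static void expect(expect_fn f) { pending[pt++ % MAXPENDING] = f; }

    /* For sets/deletes the caller doesn't care about. */
    static void ignore_ack(const char *line) { (void)line; }

    /* Called by the IO loop for each complete response line; it
       always belongs to the oldest outstanding request. */
    void on_line(const char *line) { pending[ph++ % MAXPENDING](line); }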

	Non-blocking IO means you're only waiting when there's nothing to do.

> Take "set" for example. I can do a "set" which is non-blocking, but  
> then I have to sit and spin either in the kernel or in user space  
> waiting for the "STORED" to be returned. This seems to defeat the  
> point of non-blocking IO.

	You don't have to at all.  A set is issued, its state is changed
to something like waiting_for_response, and it's added to an input
queue.  Then you start sending the next operation from your output
queue.  If a server starts sending something back to you, it's for
whatever is at the head of your input queue (in the binary protocol,
you can double-check this).
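
	In code, that lifecycle might look something like this (a sketch,
not libmemcached's or my client's actual internals; partial writes are
glossed over):

    #include <string.h>
    #include <unistd.h>

    enum op_state { OP_QUEUED, OP_WAITING_FOR_RESPONSE, OP_DONE };

    struct op {
        enum op_state state;
        const char *cmd;                 /* e.g. "set foo 0 0 3\r\nbar\r\n" */
        void (*complete)(struct op *, const char *response);
    };

    #define QSIZE 128
    static struct op *output[QSIZE];     /* not yet written         */
    static struct op *input[QSIZE];      /* waiting for a response  */
    static unsigned out_h, out_t, in_h, in_t;

    /* Issuing a set just queues it; nobody sits spinning on the socket. */
    void issue(struct op *o) {
        o->state = OP_QUEUED;
        output[out_t++ % QSIZE] = o;
    }

    /* Called by the IO loop when the socket is writable. */
    void socket_writable(int fd) {
        while (out_h != out_t) {
            struct op *o = output[out_h % QSIZE];
            if (write(fd, o->cmd, strlen(o->cmd)) < 0)
                break;                   /* EAGAIN: try again later */
            out_h++;                     /* (real code handles partial writes) */
            o->state = OP_WAITING_FOR_RESPONSE;
            input[in_t++ % QSIZE] = o;
        }
    }

    /* Called for each complete response: it's for the op at the head
       of the input queue (the binary protocol lets you verify that). */
    void response_arrived(const char *line) {
        struct op *o = input[in_h++ % QSIZE];
        o->complete(o, line);
        o->state = OP_DONE;
    }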

> I must be missing something about the above, since I can't see why  
> there is a benefit to dealing with non-blocking IO on a set, if you  
> will just end up waiting on the read() (ok, recv()).

	Not with my client (unless you want to).  :)

> On a different related note, I've noticed another issue with "set".  
> When I send a "set foo 0 0 20\r\n", I have to just send that  
> message. I can't just drop the "set" and the data to be stored in  
> the same socket. If I do that, then the server removes whatever  
> portion of the key that was contained in the "set". Maybe this is  
> my bug (though I can demonstrate it), but that seems like a waste.  
> AKA if on the server its doing a read() for the set and tossing out  
> the rest of the packet then its purposely causing two roundtrips  
> for the same data.

	By ``socket,'' do you mean ``packet?''  My client pipelines
requests in such a way that multiple gets, sets, deletes, etc. can
easily get stuffed into the same packet.
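
	For example (text protocol), a batch like this goes out with one
write() and will usually share a packet; the responses come back in the
same order:

    #include <string.h>
    #include <unistd.h>

    void send_batch(int fd) {
        const char *batch =
            "set foo 0 0 3\r\n" "bar\r\n"  /* STORED                 */
            "delete oldkey\r\n"            /* DELETED or NOT_FOUND   */
            "get foo\r\n";                 /* VALUE foo 0 3 ... END  */
        write(fd, batch, strlen(batch));   /* one syscall for all three */
    }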

> Looking through all of this, I am hoping that the binary protocol,  
> which I eagerly await reading, has a "set" which doesn't bother to  
> tell me what the result of the "set" was. You could pump a lot more  
> data into memcached if this was the case.

	We can create a qset, but the semantics would need to be carefully  
considered.  qget just keeps its errors silent and only returns  
positive results.  Should a qset do the opposite, or should it never  
return anything at all?

	Here's a fun exercise to do with memcached:

	Write out a bunch of set commands to a text file, followed by a
quit.  Pipe that into nc with output to /dev/null.  This exercises all
sorts of fun pipelining and basically shows you how fast it's possible
to write.  The speed isn't all that much of a protocol issue.
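
	Something like this, assuming memcached is listening on the
default port (11211) on localhost:

    printf 'set foo 0 0 3\r\nbar\r\nset bar 0 0 3\r\nbaz\r\nquit\r\n' > sets.txt
    nc localhost 11211 < sets.txt > /dev/null

	Repeat the set lines as many times as you like and time the nc
run to see how fast the writes go.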

-- 
Dustin Sallings

