memcache API happiness for C and Ruby users...

Wed Jan 5 15:37:23 PST 2005

>> If anyone has any neat tricks for doing this in parallel, drop me a
>> line offline.  I refuse to do this in sequence/serially so any ideas
>> are going to require either async IO, or non-blocking IO.  Ideally I'd
>> use use pthreads, but that's less than portable (despite the p in
>> pthreads).  Who knows, maybe I'll start abstracting IO handling and 
>> can
>> sneak in the use of kqueue(2) or libevent(3).  Efficient reassembly
>
> libevent only recently became safe to use in multi-threaded
> applications.  Even though libmemcache isn't multi-threaded, it's
> currently thread safe (at least when used appropriately) and thats how 
> I
> use it so I'd like to see it stay that way.  I've had serious problems
> using libevent even for trivial tasks in a multi-threaded app (even
> using it in only one of the threads).  I'll admit I haven't tried the
> latest release which fixes some issues. (it's only 2 days old!)

Heh.  Given the quick adoption of memcache(3) in commercial 
applications, I'd be reluctant to bring in libevent(3) given it uses a 
four clause BSD license and I'm currently using the more permissive MIT 
license (haven't found a way to stick it to GPL users, but I'm still 
thinking).  To be honest, I like the way that thttpd(8) abstracts its 
IO handling and may head down a similar road.

>> that's not O(N^2) is my main concern, however.  Anything that's more
>> expensive than O(2N) isn't gunna cut it.  Anyway, like I said, I'll
>> probably get to the scatter/gather portion of memcache(3) soonish.  In
>> the meantime, enjoy the ruby bindings.  I'll do some benchmarking 
>> later
>> to justify to myself the need to add Memcache::Request and
>> Memcache::Response class (hint: need to prevent excessive object
>> creation in tight loops).

I've spent a fair amount of time thinking about this, and threads sure 
would make this easy, but I can only imagine the grumpy flames I'd get 
for that.  I think this is the course of action I'm going to take:

*) move the internal buffer from struct memcache to struct 
memcache_server.  This will use more memory, but I can't think of an 
alternative that would let me safely handle multi-get's that span 
servers.

*) create a set of IO routines that wrap select(2), poll(2), and 
kqueue(2) ala thttpd(8).

*) Add a void *misc struct member to struct memcache_re(q|s).

*) In mcm_get(), iterate through each key and find its server and group 
the keys into a new request that is dedicated for a specific server.  
Essentially this splices out and creates a struct memcache_req for each 
server.

*) Hand the request objects to a routine that iterates through the 
requests using non-blocking IO.  Each request object has a state added 
to its flags to signify its state (read/write).  At the end of each 
iteration through the list of requests being processed, it hasn't 
finished its IO (reading or writing), register the file descriptor with 
the IO routines above and wait to read/write more data.  Repeat/rinse.

*) Once all IO has been completed, begin reassembling data in the 
correct order using the misc pointer for the internally used 
memcache_res that were dispatched to each of the servers.

The advantages to this are:

*) Under ideal circumstances, it iterates through each request only 
twice.  Once when creating the appropriate struct memcache_re(q|s)'s, 
and once when assigning the values back to the appropriate memcache_res 
object that gets returned to the user via struct memcache_req.

*) If there is only one server, blocking IO can be used.

*) Reassembly is O(N).

*) Waiting on descriptors is abstracted and lets folks who have 
kqueue(2) available to them use kqueue(2) and removes the use of 
select(2) unless you need it.

And the disadvantages are:

*) Buffer size goes from only one per struct memcache object to one for 
every struct memcache_server object.

All things considered, I can't think of a better way to go that 
satisfies everyone's needs.  Unless someone can sport a good reason, 
I'll make the IO selection routines configure time dependent (ie, pmk 
-e use_(kqueue|poll|select)).  -sc

PS  I probably shouldn't bury this here at the bottom of a technical 
email, but I figure it's the gearheads who'd read this that would be 
most interested.  I'm looking to pick up roughly 20-40hrs of contract 
work in the month of January in the event that anyone has any memcache 
(C, PHP, Ruby, Perl, Java, etc. or general talks/lectures) or 
PostgreSQL needs.  Email me offline if you're interested.

-- 
Sean Chittenden