memcache API happiness for C and Ruby users...
sean at chittenden.org
Wed Jan 5 15:37:23 PST 2005
>> If anyone has any neat tricks for doing this in parallel, drop me a
>> line offline. I refuse to do this in sequence/serially so any ideas
>> are going to require either async IO, or non-blocking IO. Ideally I'd
>> use pthreads, but that's less than portable (despite the p in
>> pthreads). Who knows, maybe I'll start abstracting IO handling and
>> sneak in the use of kqueue(2) or libevent(3). Efficient reassembly
> libevent only recently became safe to use in multi-threaded
> applications. Even though libmemcache isn't multi-threaded, it's
> currently thread safe (at least when used appropriately), and that's
> how I use it, so I'd like to see it stay that way. I've had serious problems
> using libevent even for trivial tasks in a multi-threaded app (even
> using it in only one of the threads). I'll admit I haven't tried the
> latest release which fixes some issues. (it's only 2 days old!)
Heh. Given the quick adoption of memcache(3) in commercial
applications, I'd be reluctant to bring in libevent(3) given it uses a
four clause BSD license and I'm currently using the more permissive MIT
license (haven't found a way to stick it to GPL users, but I'm still
thinking). To be honest, I like the way that thttpd(8) abstracts its
IO handling and may head down a similar road.
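The thttpd-style abstraction mentioned above can be sketched as a small table of function pointers with one backend per multiplexing primitive. This is only an illustrative sketch, assuming a hypothetical `mc_io_ops` interface (none of these names are part of libmemcache or thttpd); a select(2) backend is shown, and kqueue(2) or poll(2) backends would fill in the same table:

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical vtable abstracting "watch this fd, wait for readiness". */
struct mc_io_ops {
    void (*init)(void);
    void (*watch_read)(int fd);
    void (*watch_write)(int fd);
    int  (*wait)(int timeout_ms);   /* returns number of ready fds */
    int  (*is_ready)(int fd);
};

/* select(2)-backed implementation. */
static fd_set rd_set, wr_set, ready_rd, ready_wr;
static int max_fd = -1;

static void sel_init(void) {
    FD_ZERO(&rd_set); FD_ZERO(&wr_set);
    max_fd = -1;
}
static void sel_watch_read(int fd) {
    FD_SET(fd, &rd_set);
    if (fd > max_fd) max_fd = fd;
}
static void sel_watch_write(int fd) {
    FD_SET(fd, &wr_set);
    if (fd > max_fd) max_fd = fd;
}
static int sel_wait(int timeout_ms) {
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    ready_rd = rd_set;   /* select(2) mutates its sets, so work on copies */
    ready_wr = wr_set;
    return select(max_fd + 1, &ready_rd, &ready_wr, NULL, &tv);
}
static int sel_is_ready(int fd) {
    return FD_ISSET(fd, &ready_rd) || FD_ISSET(fd, &ready_wr);
}

static const struct mc_io_ops select_ops = {
    sel_init, sel_watch_read, sel_watch_write, sel_wait, sel_is_ready
};
```

Callers would then program only against `struct mc_io_ops`, so swapping in kqueue(2) touches one translation unit.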
>> that's not O(N^2) is my main concern, however. Anything that's more
>> expensive than O(2N) isn't gunna cut it. Anyway, like I said, I'll
>> probably get to the scatter/gather portion of memcache(3) soonish. In
>> the meantime, enjoy the ruby bindings. I'll do some benchmarking
>> to justify to myself the need to add Memcache::Request and
>> Memcache::Response class (hint: need to prevent excessive object
>> creation in tight loops).
I've spent a fair amount of time thinking about this, and threads sure
would make this easy, but I can only imagine the grumpy flames I'd get
for that. I think this is the course of action I'm going to take:
*) Move the internal buffer from struct memcache to struct
memcache_server. This will use more memory, but I can't think of an
alternative that would let me safely handle multi-gets that span
multiple servers.
*) create a set of IO routines that wrap select(2), poll(2), and
kqueue(2) ala thttpd(8).
*) Add a void *misc struct member to struct memcache_re(q|s).
*) In mcm_get(), iterate through each key, find its server, and group
the keys into a new request dedicated to that specific server.
Essentially this splices out and creates a struct memcache_req for each
server.
*) Hand the request objects to a routine that iterates through the
requests using non-blocking IO. Each request object carries a state in
its flags (reading/writing). At the end of each pass through the list
of in-flight requests, any request that hasn't finished its IO has its
file descriptor registered with the IO routines above, and we wait to
read/write more data. Rinse, repeat.
*) Once all IO has completed, reassemble the data in the correct order
using the misc pointer of the internally used struct memcache_res
objects that were dispatched to each of the servers.
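The key-grouping step in the plan above could look roughly like the following. This is a toy sketch: `key_to_server`, `struct server_req`, and the fixed-size arrays are illustrative stand-ins, not libmemcache's actual hashing or request structures. The point is that splitting a multi-get into per-server requests is a single O(N) pass over the keys:

```c
#include <string.h>

#define NSERVERS 3
#define MAXKEYS  8

/* Stand-in for a per-server struct memcache_req. */
struct server_req {
    const char *keys[MAXKEYS];
    int nkeys;
};

/* Toy key->server mapping; libmemcache uses its own hash. */
static unsigned key_to_server(const char *key) {
    unsigned h = 0;
    while (*key)
        h = h * 31 + (unsigned char)*key++;
    return h % NSERVERS;
}

/* Splice one multi-get into NSERVERS dedicated requests: one pass,
 * each key lands in exactly one per-server request. */
static void group_keys(const char **keys, int n,
                       struct server_req reqs[NSERVERS]) {
    memset(reqs, 0, sizeof(struct server_req) * NSERVERS);
    for (int i = 0; i < n; i++) {
        struct server_req *r = &reqs[key_to_server(keys[i])];
        r->keys[r->nkeys++] = keys[i];
    }
}
```

Reassembly is then the mirror image: walk the original key list once more and pull each value out of the per-server response it was routed to, which is where the misc pointer would carry the bookkeeping.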
The advantages to this are:
*) Under ideal circumstances, it iterates through each request only
twice. Once when creating the appropriate struct memcache_re(q|s)'s,
and once when assigning the values back to the appropriate memcache_res
object that gets returned to the user via struct memcache_req.
*) If there is only one server, blocking IO can be used.
*) Reassembly is O(N).
*) Waiting on descriptors is abstracted: folks who have kqueue(2)
available can use it, and select(2) is only used when nothing better is
around.
And the disadvantages are:
*) Buffer size goes from only one per struct memcache object to one for
every struct memcache_server object.
All things considered, I can't think of a better way to go that
satisfies everyone's needs. Unless someone can offer a good reason not
to, I'll make the IO selection routines configure-time dependent (ie,
pmk -e use_(kqueue|poll|select)). -sc
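Configure-time selection would presumably boil down to the build system defining exactly one feature macro. A minimal sketch, assuming hypothetical `USE_KQUEUE`/`USE_POLL` macros set by pmk (the macro names and the `mc_io_backend` helper are illustrative, not libmemcache's):

```c
/* Exactly one USE_* macro is expected from the configure step;
 * select(2) is the portable fallback when none is defined. */
#if defined(USE_KQUEUE)
#  define MC_IO_BACKEND "kqueue"
#elif defined(USE_POLL)
#  define MC_IO_BACKEND "poll"
#else
#  define MC_IO_BACKEND "select"
#endif

/* Report which backend was compiled in. */
const char *mc_io_backend(void) { return MC_IO_BACKEND; }
```

Doing the choice at compile time keeps the hot path free of indirection on platforms that only ever have one sensible backend.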
PS I probably shouldn't bury this here at the bottom of a technical
email, but I figure it's the gearheads who'd read this that would be
most interested. I'm looking to pick up roughly 20-40hrs of contract
work in the month of January in the event that anyone has any memcache
(C, PHP, Ruby, Perl, Java, etc. or general talks/lectures) or
PostgreSQL needs. Email me offline if you're interested.