A C client for memcached
Sean Chittenden
sean at chittenden.org
Wed Oct 27 00:21:42 PDT 2004
> Wow, nice surprise.
Yeah, I'd been feeling like a dick for sitting on this code. It'd be a
crime to have someone else duplicate the work 'cause I was too lazy to
send an email.
> Are you going to host this somewhere that I can link to from the
> clients page, or would you rather I host it?
I'd like to see it incorporated into the tree so that other language
authors can wrap it for their own language APIs (e.g. the PHP API,
which doesn't support multiple servers... I could also write a Ruby
wrapper with relative ease if there was interest). I'm not familiar
with the autofuck tool chain... could you put together the necessary
glue to have it built and installed by default? I could be of help if
you were using pmk, but I've managed to largely avoid auto* and its
headaches.
Speaking of APIs, there is a rather ugly pimple in the memcached
system: propagation of a server list is a PITA and always ends up
getting hand rolled for each installation/site. As things stand, each
client has to grab its own list and maintain it. For long-running
programs that only update their list of servers on startup...
it's problematic. What I propose doing is adding a few things to
memcached/libmemcache.
1) Change libmemcache so that it uses shared memory for its list of
servers. Then, when an application starts up, it defines an
application/key space domain (the key for the shared memory segment),
which it uses to grab a unique set of memcached servers that are
available for that domain. This is handy for folks who have different
memcached instances for different applications. Right now, when you
create a new memcache object, you call mc_new(void). I'd like to see
this become a wrapper around mc_new2(const char *domain), where
mc_new(void) calls mc_new2("default"). This would preserve the API, but
would allow all kinds of useful things to happen with a memcached admin
tool, which would manipulate the various server lists for the available
domains. The other pieces in shared memory would include a count of
the number of servers, and a u_int64_t version number to version the
server list.
For users who don't have shared memory or don't want it, mc_new() would
call mc_new_private(), which would be the same as calling
mc_new2("private"). A private server mapping is specific to the
memcache instance (identical to the current behavior of the C API).
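To make the constructor idea concrete, here is a minimal sketch of how the wrappers and the per-domain map could fit together. Everything prefixed with sketch_ is hypothetical and is not the existing libmemcache API, the struct layouts and size limits are assumptions, and the shared memory attach is left out (the map is simply allocated locally).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_SERVERS 64
#define SKETCH_HOST_LEN    256

/* Hypothetical layout of one domain's server map (what would live in
 * the shared memory segment keyed by the domain name). */
struct sketch_server_map {
    uint64_t version;   /* date-derived version of the server list */
    uint32_t nservers;  /* number of valid entries in servers[] */
    char     servers[SKETCH_MAX_SERVERS][SKETCH_HOST_LEN]; /* "host:port" */
};

/* Hypothetical client handle; the real struct memcache differs. */
struct sketch_memcache {
    char domain[64];
    int  is_private;    /* 1: per-instance map, 0: map shared per domain */
    struct sketch_server_map *map;
};

/* Domain-aware constructor. A real version would attach the shared
 * segment for `domain` here unless the domain is "private"; this
 * sketch always allocates a local, zeroed map. */
struct sketch_memcache *sketch_mc_new2(const char *domain)
{
    struct sketch_memcache *mc;

    assert(domain != NULL);
    mc = calloc(1, sizeof(*mc));
    if (mc == NULL)
        return NULL;
    strncpy(mc->domain, domain, sizeof(mc->domain) - 1);
    mc->is_private = (strcmp(domain, "private") == 0);
    mc->map = calloc(1, sizeof(*mc->map));
    if (mc->map == NULL) {
        free(mc);
        return NULL;
    }
    return mc;
}

/* The existing entry point keeps its signature and picks a domain. */
struct sketch_memcache *sketch_mc_new(void)
{
    return sketch_mc_new2("default");
}

/* Same as asking for the "private" domain. */
struct sketch_memcache *sketch_mc_new_private(void)
{
    return sketch_mc_new2("private");
}

void sketch_mc_free(struct sketch_memcache *mc)
{
    if (mc != NULL) {
        free(mc->map);
        free(mc);
    }
}
```

Existing callers would keep calling the no-argument constructor, while a new application would pass its own domain, e.g. sketch_mc_new2("myapp").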
2) Add a memcached administration program that manages the server lists
that reside in shared memory... say, mcadmin(8). If a memcache client
is using a shared list, someone should be able to execute, `mcadmin
--domain myapp add new_memcache_host:11211` and instantly have all
libmemcache users take advantage of the memcache instance. A delete
command should be available as well. Hell, why not have a generic
mclient(1) program that can be integrated with shell scripts (`mclient
get key`, or `some_cmd | mclient set foo`).
3) A "clients" command. It prints out a list of the client IP
addresses. This would primarily be used by the mcadmin(8) program,
which, when run on any client that has a server list, would get the
list of servers for a domain, connect to each server and issue the
"clients" command, and record the list of consumers. From this list, it should
be possible to have mcadmin(8) run around to the various servers (some
kerberized service, etc.) and run the appropriate mcadmin command.
A better way to do #3 would be to have the server return something like
SERVER_UPDATE right before an END command. Here is an example:
get foo\r\n
bar\r\n
SERVER_UPDATE <domain> <version>\r\n
END\r\n
SERVERS <domain> <version>\r\n
SERVER mc1.example.com:11211\r\n
SERVER mc2.example.com:11211\r\n
SERVER mc4.example.com:11211\r\n
END\r\n
Or, in the event of a cache miss:
get non-exist\r\n
SERVER_UPDATE <domain> <version>\r\n
...
END\r\n
Where <version> represents the version number for its server list
stored in shared memory (date-derived and stored in a u_int64_t...
something like 200410260000000, i.e. YYYYMMDD followed by a seven-digit
counter, which would allow for 10 million updates in a day). If the
version number given by the server is newer
than the version number the client already has loaded, it reads the
server list from the server, updates the shared map, and proceeds with
its queries. All clients connected to a memcached server would receive
this command, but only one on a given host should update the shared
map. The only problem with this is that the memcached server would
send every server in the list. Not a huge issue, but still an issue.
It's not like routing, where one can justify the overhead of adding
support for incremental changes.
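The version check a client would do on seeing SERVER_UPDATE can be sketched roughly as follows. The function names are made up for illustration, the 63-character domain limit is an assumption, and uint64_t stands in for u_int64_t.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build a date-derived version: YYYYMMDD followed by a seven-digit
 * per-day sequence number (so at most 10 million updates per day). */
uint64_t sketch_make_version(unsigned year, unsigned month, unsigned day,
                             uint64_t seq)
{
    uint64_t date;

    assert(seq < 10000000ULL);
    date = (uint64_t)year * 10000u + month * 100u + day;
    return date * 10000000ULL + seq;
}

/* Inspect one response line of the form
 *   SERVER_UPDATE <domain> <version>\r\n
 * Returns  1 if it names our domain with a newer version (the client
 *            should go on to read the server list),
 *          0 if it is for another domain or is not newer,
 *         -1 if the line is not a SERVER_UPDATE line at all. */
int sketch_update_is_newer(const char *line, const char *my_domain,
                           uint64_t my_version)
{
    char domain[64];
    uint64_t version;

    if (sscanf(line, "SERVER_UPDATE %63s %" SCNu64, domain, &version) != 2)
        return -1;
    if (strcmp(domain, my_domain) != 0)
        return 0;
    return version > my_version;
}
```

Because the date occupies the high digits, a plain integer comparison orders versions by day first and by sequence number within the day.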
4) A SERVER command so that a client can propagate changes that it
learns about. For example:
server <domain> <version>\r\n
delete mc6.example.com:11211\r\n
add mc5.example.com:11211\r\n
END\r\n
Then, if the version is newer than the version stored on the server, it
adds the listed servers to its server map and announces the changes to
its clients. Having a client such as mcadmin(8) query a memcached
server, get a list of servers, then issue updates to all servers is
more appealing to me than having servers aware of their neighbors. It
sure is tempting to have the memcached cluster aware of its other
servers and propagating changes that way, but that seems like too big
of a logistics headache to me (seems like the same headache that
routing software is plagued with). Having a single mcadmin(8) program
connect to all servers seems like a better way to go. Simple is best.
I know I can get this information from sockstat(1)/netstat(1), but
sockstat(1)/netstat(1) don't exist everywhere and I'd just as soon
integrate this simple functionality into the base so that it'll
propagate very quickly.
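For the add/delete update in #4, the receiving side might look something like the sketch below. All names and size limits are hypothetical, the lines are assumed to arrive already split with the trailing \r\n removed, and the choice to reject the whole update when one line is bad is mine, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX 16
#define SKETCH_LEN 128

/* Hypothetical in-memory server list with its version. */
struct sketch_list {
    uint64_t version;
    int      count;
    char     hosts[SKETCH_MAX][SKETCH_LEN];  /* "host:port" */
};

/* Index of host in the list, or -1 if absent. */
static int sketch_find(const struct sketch_list *l, const char *host)
{
    for (int i = 0; i < l->count; i++)
        if (strcmp(l->hosts[i], host) == 0)
            return i;
    return -1;
}

/* Apply one "add <host>" or "delete <host>" line.
 * Returns 0 on success, -1 on an unknown verb or a full/oversized entry. */
int sketch_apply_line(struct sketch_list *l, const char *line)
{
    assert(l != NULL && line != NULL);
    if (strncmp(line, "add ", 4) == 0) {
        const char *host = line + 4;
        if (sketch_find(l, host) >= 0)
            return 0;                       /* already listed */
        if (l->count == SKETCH_MAX || strlen(host) >= SKETCH_LEN)
            return -1;
        strcpy(l->hosts[l->count++], host);
        return 0;
    }
    if (strncmp(line, "delete ", 7) == 0) {
        int i = sketch_find(l, line + 7);
        if (i < 0)
            return 0;                       /* nothing to remove */
        /* Close the gap while keeping the remaining order intact. */
        memmove(l->hosts[i], l->hosts[i + 1],
                (size_t)(l->count - i - 1) * sizeof(l->hosts[0]));
        l->count--;
        return 0;
    }
    return -1;
}

/* Apply a whole update if its version is newer than the stored one.
 * Returns 1 if applied, 0 if ignored as stale, -1 if a line was bad
 * (in which case the list is left untouched). */
int sketch_apply_update(struct sketch_list *l, uint64_t version,
                        const char *const *lines, int nlines)
{
    struct sketch_list tmp;

    if (version <= l->version)
        return 0;
    tmp = *l;                               /* work on a copy */
    for (int i = 0; i < nlines; i++)
        if (sketch_apply_line(&tmp, lines[i]) != 0)
            return -1;
    tmp.version = version;
    *l = tmp;
    return 1;
}
```

Working on a copy means a malformed update never leaves a half-applied list behind, which matters if clients hash keys across the list.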
5) A server <domain> list\r\n command. When a client first connects to
the memcached server, it *should* (doesn't have to) issue this command
to get an updated server list. With long running connections, this
overhead seems negligible to me and easily justified.
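A client-side sketch of #5: format the request, then parse the reply. The reply format is not spelled out for this command, so this assumes it mirrors the SERVERS/SERVER/END block shown earlier; the function names and the 127-character host limit are likewise assumptions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write "server <domain> list\r\n" into buf.
 * Returns the length written, or -1 if buf is too small. */
int sketch_format_list_request(char *buf, size_t buflen, const char *domain)
{
    int n;

    assert(buf != NULL && domain != NULL);
    n = snprintf(buf, buflen, "server %s list\r\n", domain);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/* Parse a complete reply of the assumed form
 *   SERVERS <domain> <version>\r\n
 *   SERVER <host:port>\r\n   (zero or more)
 *   END\r\n
 * Stores the version and up to max_hosts hosts.
 * Returns the number of hosts, or -1 on a malformed or oversized reply. */
int sketch_parse_servers(const char *reply, uint64_t *version,
                         char hosts[][128], int max_hosts)
{
    char domain[64];
    const char *p = reply;
    int n = 0;

    if (sscanf(p, "SERVERS %63s %" SCNu64, domain, version) != 2)
        return -1;
    /* Step to the start of each following line. */
    while ((p = strstr(p, "\r\n")) != NULL) {
        p += 2;
        if (strncmp(p, "END", 3) == 0)
            return n;
        if (n == max_hosts || sscanf(p, "SERVER %127s", hosts[n]) != 1)
            return -1;
        n++;
    }
    return -1;  /* ran out of input before END */
}
```

A client would send the formatted request once right after connecting and compare the returned version against its shared map before replacing anything.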
Yeah, I know these aren't small changes and would probably require a
major version bump, but, I think it'd be worth it. :)
> As for your other comments, I'll look into them.
Thanks. If you have any questions, please let me know. I'm going to
knock out an mclient(1) program as a start, then go about adding the
above functionality unless I hear some kind of overwhelming objection.
Right now I have to have each client maintain its own server list and
now that I've got libmemcache embedded in PostgreSQL, postfix, dbmail,
and a few other places, maintaining, distributing, and notifying long
running processes of those changes is a *huge* pain in the ass. Having
it built into the protocol/system would be exceedingly convenient for
developers and admins who want to bring machines up and down with
little notice. -sc
--
Sean Chittenden