A C client for memcached
Sean Chittenden
sean at chittenden.org
Thu Oct 28 16:46:55 PDT 2004
> -- a Ruby client already exists. (mail ged at danga.com)
Ah, cool. Didn't see it on the list of APIs so figured it didn't exist.
> -- the memcached server list propogation is a very real problem,
> but there's an even worse one which involves re-hashing and
> nodes flapping.
Node flapping == saturation of the memcache daemons with duplicate keys
that potentially have expired data? I was thinking about this problem
too and came at it with a different approach... that makes adding a
server more CPU/network intensive and requires that only one
server_update command (or whatever the command ends up being) issued at
a time before the network settles down.
What I was thinking was once a server has received all of the server
update commands and has built a complete map of the new network, the
server would walk through its keyspace and determine if a given key
belongs on this machine or not. If it doesn't, it can do one of two
things. It can delete the key or connect to the other server and send
the key to its new memcached instance. So, if there are three servers
and a fourth one is added, roughly 33% of the keys on each server would
get redistributed/deleted. The nice thing about this approach is
adding a server wouldn't cause havoc with the database (where all old
keys essentially expire and the database gets trounced).
So, potentially, if a client gets a SERVER_UPDATE command and no data,
then the client retries its query to the new server, and hopfully gets
a hit.
Here's a potential sequence:
*) mcutil(8)/mcadmin(8) connects to a backend and mucks with the server
map.
*) The backend constructs a new copy of the server list, but doesn't
announce its existence.
*) The backend walks through all keys on the server, rehashing them
according to the new server scheme. Keys are either deleted or moved
to their new server.
*) The backend keeps servicing requests for keys that have moved until
all keys have been moved or marked for deletion.
*) If a key has been moved, once the move is complete, its key is
marked for deletion.
*) The list of connections is walked and a SERVER_UPDATE bit is set
that way after the next query, a SERVER_UPDATE command can be sent to
the client.
*) Once all clients, or X% of the clients have been updated, the server
walks through its list of keys that have been marked for deletion and
actually deletes it. Many keys may expire or have been removed during
that time... to be expected.
Problems:
*) If a memcached(8) instance does not have thread support,
memcached(8) would have to block on updates. If it does have thread
support, the file descriptor is handled by a newly spawned thread that
way updates don't lock the whole server while data is being shuffled
around.
*) I've never done pthreads + libevent, but don't think that should be
a problem. pthreads + home grown kqueue(2) loop hasn't resulted in any
problems that I've ever seen.
Questions:
Aside from the fact that you're right about server lists being the
smaller part of the problem, would having the servers store a server
list be accepted as a patch? If not, then I can walk around all
servers with an mcadmin(8) utility that updates the list in shared
memory only. Redistributing keys becomes possible though if the
servers have a list of their peers, which seems very appealing to me.
-sc
--
Sean Chittenden
More information about the memcached
mailing list