Client interoperability, server selection.

Wed Mar 12 13:41:03 UTC 2008

On Tue, Mar 11, 2008 at 22:17:51 +0300, Tomash Brechko wrote:
> So, can we all agree on fixed (user controllable) number of points per
> server,

The last thing to consider is whether or not to hash the namespace
prefix along with the key.  In Cache::Memcached compatibility mode
namespace prefix is not hashed, but I think for compatible Ketama
implementation it should be hashed.  Not hashing the prefix is
counter-intuitive: the same "some/key" may end up being hosted on
different servers, because one client could use namespace: "some",
key: "key", and the other namespace: "", key: "some/key".  My guess is
that most users expect this to be interchangeable (and this is also a
requirement for interoperability with clients that do not have the
notion of namespace prefix).

The reference implementation may be found at

http://git.openhack.ru/?p=Cache-Memcached-Fast.git;a=blob;f=src/dispatch_key.c;hb=ketama-compat

Number of points for a given server is computed on line 160.  Hashing
of "server:port<4-byte little-endian seqno starting from zero>"
happens on lines 166-185 ("host" and "port" are exactly as specified
by the user).  Fast-forward on insert collision is on lines 208-209.
Namespace prefix hashing happens on line 273, and requested key
hashing is on line 237.  Binary search that is used both to find the
position for new points on server insert, and to find a server for a
given key, is on lines 48-83.  In particular, rewind to the first
server among the run of equal points is on lines 71-72 (note that this
code is executed very rarely).  Wraparound is on lines 79-80 (on
insert we detect wraparound and insert to the end, lines 195-198).

FNV-1a hash function is implemented in a separate file (because
shouldn't be copyrighted),

http://git.openhack.ru/?p=Cache-Memcached-Fast.git;a=blob;f=src/compute_fnv1a.h;hb=ketama-compat

I'm ready to release the code, but to call it "compatible Ketama
implementation" there ought to be more than one client implementing
it.  So I ask client authors: please either explicitly express your
commitment to the above algorithm, or share your doubts/concerns.
When, say, at least 3 more client authors will commit to it, I will
consider this as carved in stone, and release the next version of
C::M::F.  But please do actually try to implement the algorithm before
agreeing on it.  The implementation won't take long, but perhaps you
will encounter some difficulties with it for your language and/or
client architecture.  Thanks in advance! ;)

-- 
   Tomash Brechko