Client interoperability, server selection.

Tomash Brechko tomash.brechko at gmail.com
Sun Mar 9 13:21:51 UTC 2008


On Sun, Mar 09, 2008 at 00:40:32 +0100, Henrik Schröder wrote:
> When placing the servers on the continuum, you have to choose how many times
> each server appears, which hashing algorithm you use to generate the places,
> and which scheme you use to get each successive point for a given server.

Indeed, when we say the client implements the Ketama consistent
hashing algorithm, we mostly mean the idea of mapping servers onto
points on the ring, but not necessarily the same implementation as in
libketama.

While developing Cache::Memcached::Fast we considered using libketama,
but it does too many things (I/O, shared memory, synchronization),
plus it uses MD5 hash, which I think is an overkill: there's no
requirement for server name hash to be irreversible, plus MD5 is 16
bytes when only 4 bytes are needed.  It's not a question of
performance of course, as server additions/deletions are rare, but
more a matter of minimalistic approach.

Here's what Cache::Memcached::Fast does instead:

 - each server is mapped onto <ketama_points> * <weight> rounded to
   nearest integer (both are parameters, server weight may be rational).

 - hash function is:

     hash = crc32(host);
     hash = crc32_add(hash, "\0");
     hash = crc32_add(hash, port); // port is a decimal string
     for (i = 0; i < point_count; ++i)
       {
         buf[0] = i & 0xff;
         buf[1] = (i >> 8) & 0xff;
         buf[2] = (i >> 16) & 0xff;
         buf[3] = (i >> 24) & 0xff;
         point = crc32_add(hash, buf);
         ...
       }

   i.e. each point is a CRC32 hash of the concatenation of host name,
   null byte, port number in decimal, and four bytes of the sequential
   point index, least significant byte first.

 - keys are hashed with CRC32.  The key is mapped to the server with
   the nearest point that is not less than CRC32(key) (or wraps around
   to the server with min POINT(server), when CRC32(key) is greater
   than any point).

 - servers are added in order.  When there's already the server with
   the same POINT(server), the new one is added _after_ the old ones.
   When looking for the server for a given key, and there are several
   servers with the same POINT(server), we take the _first_ one (I
   actually fixed this step in Cache::Memcached::Fast only today).

The implementation can be viewed at

 http://git.openhack.ru/?p=Cache-Memcached-Fast.git;a=blob;f=src/dispatch_key.c

functions ketama_crc32_add_server() and ketama_crc32_get_server().


I'm sure not everyone will like it if I say "let's take my approach as
the standard one".  However, if your client is already compatible with
Cache::Memcached (which is

  server = servers[((CRC32(key) >> 16) & 0x7fff) % total_servers];

), then you already have implemented CRC32 (and also have the notion
of server weights).  So there's no point in introducing yet another
hash function, and the above scheme looks like a fine approach.  If we
are to agree on any standard, I wish it will be close to the above.


-- 
   Tomash Brechko


More information about the memcached mailing list