ketama - a consistent hashing algo for memcache clients

Pierre Demartines pierre at demartines.com
Tue Apr 10 23:20:07 UTC 2007


FYI, we found the Fowler-Noll-Vo hashes to be very quick to compute, and
yet having beautiful properties in terms of dispersion.

FWIW, here is a simple Java implementation that computes a 64-bit
hashcode for a class that extends CharSequence:

package com.yourco.util;

/**
 * Enables a CharSequence with the capability to produce a
Fowler-Noll-Vo (FNV) 64-bit hashcode.<p>
 * 
 * FNV hashes are designed to be fast while maintaining a low collision
rate.
 * The FNV speed allows one to quickly hash lots of data while
maintaining a
 * reasonable collision rate. The high dispersion of the FNV hashes
makes them
 * well suited for hashing nearly identical strings such as URLs,
hostnames,
 * filenames, text, IP addresses, etc.
 * 
 * Usage: add "extends CharSequenceHash64" to a class that implements
CharSequence,
 * and that class will have hashCode64().<p>
 * 
 * You can daisy-chain several hashCode calculations like this:<p>
 * <pre>long hashOfSeveral =
a.hashCode64(b.hashCode64(c.hashCode64()));</pre><p>
 * or, alternatively, like that:
 * <pre>
 * long hashOfSeveral = CharSequenceHash64.FNV1_64_INIT;
 * for (MyCharSequence x : collectionOfMyCharSequences) {
 *    hashOfSeveral = x.hashCode64(hashOfSeveral);
 * }</pre><p>

 * @see http://www.isthe.com/chongo/tech/comp/fnv/  and
 * @see http://en.wikipedia.org/wiki/Fowler_Noll_Vo_hash<p>
 * 
 * @author pdemartines at quantcast.com
 */
public abstract class CharSequenceHash64 implements CharSequence {
	public  static final long FNV1_64_INIT = 0xcbf29ce484222325L;
	private static final long FNV_64_PRIME = 0x100000001b3L;
	/**
	 * @param initialValue -- optional: initial value
	 * @return a 64-bit FNV hash called fnv164
	 */
	public long hashCode64(long initialValue) {
		long hval = initialValue;
		int len = length();
		for (int i=0; i<len; i++) {
			hval *= FNV_64_PRIME;
			hval ^= (long)charAt(i);
		}

		return hval;
	}

	public long hashCode64() {
		return hashCode64(FNV1_64_INIT);
	}
}


Here is a junit test for it, that shows how it can be used:

package com.yourco.util;

import java.util.HashMap;

import junit.framework.TestCase;

public class CharSequenceHash64_T extends TestCase {

	public class MyCharSeq extends CharSequenceHash64 implements
CharSequence {
		private char[] buffer = null;
		private MyCharSeq() {}
		private MyCharSeq(int length) {
			buffer = new char[length];
		}
		public MyCharSeq(String s) {
			buffer = s.toCharArray();
		}
		public int length() {
			return (buffer == null ? 0 : buffer.length);
		}
		public char charAt(int index) {
			if (index < length()) {
				return buffer[index];
			} else {
				throw new IndexOutOfBoundsException();
			}
		}
		public String toString() {
			if (buffer == null) return "";
			return new String(buffer);
		}
		public CharSequence subSequence(int start, int end) {
			if (buffer == null) throw new IndexOutOfBoundsException();
			int length = end-start;
			MyCharSeq sub = new MyCharSeq(length);
			System.arraycopy(buffer, start, sub.buffer, 0, end-start);
			return sub;
		}
	}
    
    // FNV's hashCode64
    // Compare this function's value against known values (calculated
using bona-fide implementations of FNV164)
    public void testHashCode64() {
    	HashMap<String,Long> value = new HashMap<String,Long>();
    	value.put("", new Long(0xcbf29ce484222325L));
    	value.put(" ", new Long(0xaf63bd4c8601b7ffL));
    	value.put("hello world!", new Long(0x58735284b97b86bcL));
    	value.put("Lorem ipsum dolor sit amet, consectetuer adipiscing
elit.", new Long(0x536c9cdee87c054aL));
    	value.put("wd:com.google", new Long(0xcf4e7986071b08f8L));
    	value.put("wd:com.google ", new Long(0x5d6176be12f03d48L));
    	for (String s:value.keySet()) {
    		long h64 = value.get(s);
    		assertEquals("\""+s+"\".hashCode64()", h64, new
MyCharSeq(s).hashCode64());
    	}
    }
}


Enjoy,

~Pierre


On Tue, 2007-04-10 at 23:01 +0100, Richard Jones wrote:
> > 	This is a very interesting idea, and I'd like to add a consistent
> > hash driver to my java interface, but there's a bit of gap between
> > your writeup and what the implementation.
> 
> Check the init() method - that builds up the appropriate data structure (a 
> TreeMap called buckets iirc). You can use any hashing mechanism you want 
> instead of the md5 thing, that just seemed a convenient way to generate a 
> bunch of hashes. If you change it you won't be compatible with other 
> implementations - not a problem if you are exclusively using java tho.
> 
> Once you've build the data structure using the servers list, that is pretty 
> much it in java, just check how keys are mapped to servers in findPointFor().
> 
> RJ
> 
> 
> >
> > 	In the java interface, all I see is that you've added a new hash
> > type that's based off an MD5.  Is that actually an important detail?
> > It seems like you should be able to use any hash that produces an
> > even 32-bits.
> >
> > 	Is it OK if I use your C implementation as a specification?
> 



More information about the memcached mailing list