ketama - a consistent hashing algo for memcache clients

Pierre Demartines pierre at demartines.com
Wed Apr 11 01:08:59 UTC 2007


It seems that the algorithm is fairly widespread and open. Please check
Landon Curt Noll's excellent website on the subject:
http://www.isthe.com/chongo/tech/comp/fnv

As to my java implementation, you may use & modify as you wish.  I guess
it would be good to keep the mention to Noll's url that is in the
comment.

Cheers,

~Pierre


On Tue, 2007-04-10 at 17:52 -0700, Dustin Sallings wrote:
> 
> 
> Do you have a license for this?  Seems like something I could use as a
> decent hash provider.
> 
> On Apr 10, 2007, at 16:20 , Pierre Demartines wrote:
> 
> > FYI, we found the Fowler-Noll-Vo hashes to be very quick to compute,
> > and
> > yet having beautiful properties in terms of dispersion.
> > 
> > 
> > FWIW, here is a simple Java implementation that computes a 64-bit
> > hashcode for a class that extends CharSequence:
> > 
> > 
> > package com.yourco.util;
> > 
> > 
> > /**
> >  * Enables a CharSequence with the capability to produce a
> > Fowler-Noll-Vo (FNV) 64-bit hashcode.<p>
> >  * 
> >  * FNV hashes are designed to be fast while maintaining a low
> > collision
> > rate.
> >  * The FNV speed allows one to quickly hash lots of data while
> > maintaining a
> >  * reasonable collision rate. The high dispersion of the FNV hashes
> > makes them
> >  * well suited for hashing nearly identical strings such as URLs,
> > hostnames,
> >  * filenames, text, IP addresses, etc.
> >  * 
> >  * Usage: add "extends CharSequenceHash64" to a class that
> > implements
> > CharSequence,
> >  * and that class will have hashCode64().<p>
> >  * 
> >  * You can daisy-chain several hashCode calculations like this:<p>
> >  * <pre>long hashOfSeveral =
> > a.hashCode64(b.hashCode64(c.hashCode64()));</pre><p>
> >  * or, alternatively, like that:
> >  * <pre>
> >  * long hashOfSeveral = CharSequenceHash64.FNV1_64_INIT;
> >  * for (MyCharSequence x : collectionOfMyCharSequences) {
> >  *    hashOfSeveral = x.hashCode64(hashOfSeveral);
> >  * }</pre><p>
> > 
> > 
> >  * @see http://www.isthe.com/chongo/tech/comp/fnv/  and
> >  * @see http://en.wikipedia.org/wiki/Fowler_Noll_Vo_hash<p>
> >  * 
> >  * @author pdemartines at quantcast.com
> >  */
> > public abstract class CharSequenceHash64 implements CharSequence {
> > public  static final long FNV1_64_INIT = 0xcbf29ce484222325L;
> > private static final long FNV_64_PRIME = 0x100000001b3L;
> > /**
> > * @param initialValue -- optional: initial value
> > * @return a 64-bit FNV hash called fnv164
> > */
> > public long hashCode64(long initialValue) {
> > long hval = initialValue;
> > int len = length();
> > for (int i=0; i<len; i++) {
> > hval *= FNV_64_PRIME;
> > hval ^= (long)charAt(i);
> > }
> > 
> > 
> > return hval;
> > }
> > 
> > 
> > public long hashCode64() {
> > return hashCode64(FNV1_64_INIT);
> > }
> > }
> > 
> > 
> > 
> > 
> > Here is a junit test for it, that shows how it can be used:
> > 
> > 
> > package com.yourco.util;
> > 
> > 
> > import java.util.HashMap;
> > 
> > 
> > import junit.framework.TestCase;
> > 
> > 
> > public class CharSequenceHash64_T extends TestCase {
> > 
> > 
> > public class MyCharSeq extends CharSequenceHash64 implements
> > CharSequence {
> > private char[] buffer = null;
> > private MyCharSeq() {}
> > private MyCharSeq(int length) {
> > buffer = new char[length];
> > }
> > public MyCharSeq(String s) {
> > buffer = s.toCharArray();
> > }
> > public int length() {
> > return (buffer == null ? 0 : buffer.length);
> > }
> > public char charAt(int index) {
> > if (index < length()) {
> > return buffer[index];
> > } else {
> > throw new IndexOutOfBoundsException();
> > }
> > }
> > public String toString() {
> > if (buffer == null) return "";
> > return new String(buffer);
> > }
> > public CharSequence subSequence(int start, int end) {
> > if (buffer == null) throw new IndexOutOfBoundsException();
> > int length = end-start;
> > MyCharSeq sub = new MyCharSeq(length);
> > System.arraycopy(buffer, start, sub.buffer, 0, end-start);
> > return sub;
> > }
> > }
> > 
> > 
> >     // FNV's hashCode64
> >     // Compare this function's value against known values
> > (calculated
> > using bona-fide implementations of FNV164)
> >     public void testHashCode64() {
> >      HashMap<String,Long> value = new HashMap<String,Long>();
> >      value.put("", new Long(0xcbf29ce484222325L));
> >      value.put(" ", new Long(0xaf63bd4c8601b7ffL));
> >      value.put("hello world!", new Long(0x58735284b97b86bcL));
> >      value.put("Lorem ipsum dolor sit amet, consectetuer adipiscing
> > elit.", new Long(0x536c9cdee87c054aL));
> >      value.put("wd:com.google", new Long(0xcf4e7986071b08f8L));
> >      value.put("wd:com.google ", new Long(0x5d6176be12f03d48L));
> >      for (String s:value.keySet()) {
> >      long h64 = value.get(s);
> >      assertEquals("\""+s+"\".hashCode64()", h64, new
> > MyCharSeq(s).hashCode64());
> >      }
> >     }
> > }
> > 
> > 
> > 
> > 
> > Enjoy,
> > 
> > 
> > ~Pierre
> > 
> > 
> > 
> > 
> > On Tue, 2007-04-10 at 23:01 +0100, Richard Jones wrote:
> > > > This is a very interesting idea, and I'd like to add a
> > > > consistent
> > > > hash driver to my java interface, but there's a bit of gap
> > > > between
> > > > your writeup and what the implementation.
> > > 
> > > 
> > > Check the init() method - that builds up the appropriate data
> > > structure (a 
> > > TreeMap called buckets iirc). You can use any hashing mechanism
> > > you want 
> > > instead of the md5 thing, that just seemed a convenient way to
> > > generate a 
> > > bunch of hashes. If you change it you won't be compatible with
> > > other 
> > > implementations - not a problem if you are exclusively using java
> > > tho.
> > > 
> > > 
> > > Once you've build the data structure using the servers list, that
> > > is pretty 
> > > much it in java, just check how keys are mapped to servers in
> > > findPointFor().
> > > 
> > > 
> > > RJ
> > > 
> > > 
> > > 
> > > 
> > > > 
> > > > 
> > > > In the java interface, all I see is that you've added a new hash
> > > > type that's based off an MD5.  Is that actually an important
> > > > detail?
> > > > It seems like you should be able to use any hash that produces
> > > > an
> > > > even 32-bits.
> > > > 
> > > > 
> > > > Is it OK if I use your C implementation as a specification?
> > > 
> > > 
> > 
> > 
> 
> -- 
> Dustin Sallings
> 
> 
> 



More information about the memcached mailing list