ketama consistent hashing update
Brad Fitzpatrick
brad at danga.com
Wed Aug 8 10:15:27 UTC 2007
Richard,
I was actually looking at ketama earlier today, after a lunch discussion
with coworkers about finally standardizing the consistent hashing
algorithms in the memcached community.
When I looked at the code earlier today, I noticed you weren't doing the
optimization I did with my Set::ConsistentHash implementation in Perl:
after building the continuum, evaluate get_point() at, say, 1024 points,
yielding 1024 pre-computed answers in an array. And then at runtime,
picking a server is a lot faster: it's just a regular hash (crc32, md4,
whatever) on the key, then an array access. In my profiling, this made a
big difference.
I like that you're going forward with ketama, and that there's both a C
and Java implementation, but I'd like to have a written spec first, not
just a de-facto standard implementation. (even if "spec" is very informal,
but in memcached's doc/ directory at least) I also want the spec to
clarify:
-- do we pre-compute 'n' points like my perl client?
(note that this can be done lazily too, only evaluating each of
1024 positions the first time they're accessed)
-- which digest algorithm(s) for the points and/or array lookup?
-- how many points are put on the continuum for each target?
+
-- what is the input to the digest algorithm for points 1-n?
e.g.:
"127.0.0.1:11211-1"
"127.0.0.1:11211-2"
"127.0.0.1:11211-3"
....
"127.0.0.1:11211-<n>"
etc.
In the same way agreeing on crc32() was beneficial to all of us before,
and interlanguage interop, I think it's important to get this right too.
Besides you and me, who else has done consistent hashing implementations
here that we should involve?
- Brad
On Wed, 8 Aug 2007, Richard Jones wrote:
> ketama is a consistent hashing library we use in our memcached clients,
> mentioned on this list a few months ago. I've updated our public svn:
>
> * libketama changed to BSD license
> * minor tweaks/optimisations
> * 'make test' has a basic regression test and hashing distribution check
> * added python_ketama (from blogg.se folks)
>
> svn://svn.audioscrobbler.net/misc/ketama
>
> There is also a (slightly outdated) patch in the patches dir to replace md5
> hashing with FNV-1a, which is ~60% faster, which i am yet to merge properly.
>
> --
> Richard Jones
> Last.fm Ltd. | http://www.last.fm/
> Office: +44 (0) 207 780 7080
>
>
More information about the memcached
mailing list