ketama consistent hashing update
sgrimm at facebook.com
Wed Aug 8 10:47:01 UTC 2007
Brad Fitzpatrick wrote:
> -- do we pre-compute 'n' points like my perl client?
> (note that this can be done lazily too, only evaluating each of
> 1024 positions the first time they're accessed)
I'd like to request that we standardize on a much bigger number than
1024. If you have a ton of memcached instances, "n" doesn't have to be
very high before a space of 1024 values starts to get pretty crowded, if
not completely full.
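To make the crowding concrete, here is a rough back-of-the-envelope sketch (mine, not from the post): if each server contributes some number of points hashed uniformly into a 1024-slot space, the expected number of occupied slots saturates quickly as servers are added.

```python
def expected_occupied(slots, points):
    # Each slot stays empty with probability ((slots-1)/slots)**points,
    # so the expected occupied count is slots * (1 - that probability).
    return slots * (1 - ((slots - 1) / slots) ** points)

# Hypothetical: 100 points per server, the figure ketama-style clients
# often use per target (an assumption, not a number from the post).
for servers in (10, 50, 100):
    points = servers * 100
    print(servers, "servers ->", round(expected_occupied(1024, points)),
          "of 1024 slots expected occupied")
```

With 50 servers at 100 points each, over 99% of a 1024-slot space is expected to be occupied, so further points land almost entirely on collisions.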
Speaking of which, any spec should clarify what happens when there's a
collision between two targets. I would prefer it to be deterministic
based on the targets in question, rather than based on insertion order;
then you are guaranteed to get the same results in a client which has
added new targets incrementally over time as you do in one that started
out with the complete list in some undefined order.
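One way to get that guarantee is to break ties by comparing the colliding target names themselves, rather than keeping whichever was inserted first. A minimal sketch in Python (my illustration, not a spec; the `point` hash and 100-replica count are assumptions):

```python
import hashlib

def point(target, replica):
    # Hash "target-replica" and take the first 4 bytes as a 32-bit point.
    digest = hashlib.md5(f"{target}-{replica}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

def build_ring(targets, replicas=100):
    ring = {}
    for target in targets:
        for r in range(replicas):
            p = point(target, r)
            # Deterministic tie-break: on a collision, keep whichever
            # target name sorts lower. The result depends only on the
            # set of targets, never on insertion order.
            if p not in ring or target < ring[p]:
                ring[p] = target
    return ring
```

A client that added targets one at a time and one that started with the full list in any order now build identical rings.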
I am not completely convinced there's much value to limiting the space
to some arbitrarily small range like 1024 in the first place. Given that
this is on the client side (where the request rate is relatively low), a
binary search or n-ary tree traversal is likely to be indistinguishable
from an array lookup in performance terms given the likely depth of the
tree, and if you have a full 32 bits worth of number space, collisions
become very unlikely. Given that the array in question is just a cache
of the results of searching some underlying data structure -- hence the
fact that it can be populated lazily -- IMO it should have a
demonstrable performance benefit to be worth the extra layer of complexity.
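The full-32-bit variant is short enough to sketch. This is my illustration of the general technique, not anyone's client code; the hash and replica count are placeholders:

```python
import bisect
import hashlib

def hash32(s):
    # First 4 bytes of MD5 as an unsigned 32-bit value.
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")

class Ring:
    def __init__(self, targets, replicas=100):
        # All points for all targets, sorted once; no 1024-slot cache.
        self.points = sorted((hash32(f"{t}-{r}"), t)
                             for t in targets for r in range(replicas))
        self.keys = [p for p, _ in self.points]

    def lookup(self, key):
        # Binary search for the first point at or after the key's hash.
        i = bisect.bisect(self.keys, hash32(key))
        # Wrap around the ring if we ran off the end.
        return self.points[i % len(self.points)][1]
```

With a few hundred targets at 100 points each, the binary search is around 15 comparisons per lookup, which on the client side is lost in the noise next to the network round trip.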
The complexity, BTW, is not just in searching. The array really is a
cache, which means you have to be careful to invalidate it at the right
times. When you add a new target in your lazy-evaluated implementation,
are you quite certain the code will find it given that previous targets
may already have been cached in the array? Et cetera.
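The invalidation hazard looks something like this sketch (mine, with a placeholder `_search` standing in for the real underlying lookup): entries cached before a target was added silently keep routing to the old target set unless the cache is flushed.

```python
class LazyRing:
    def __init__(self, targets):
        self.targets = list(targets)
        # Lazily populated cache of search results, one per slot.
        self.cache = [None] * 1024

    def add_target(self, target):
        self.targets.append(target)
        # The array is derived data. Without this flush, slots cached
        # before the add would never see the new target.
        self.cache = [None] * 1024

    def lookup(self, slot):
        if self.cache[slot] is None:
            self.cache[slot] = self._search(slot)
        return self.cache[slot]

    def _search(self, slot):
        # Placeholder for searching the real underlying data structure.
        return self.targets[slot % len(self.targets)]
```

Forgetting the flush in `add_target` is exactly the kind of bug that only surfaces under incremental reconfiguration, which is the scenario the cache is supposed to survive.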