python-memcached server selection code (long)

Thu Oct 25 13:09:18 UTC 2007

I do not see "sequence of identical selections" being
a problem.  It is like saying you flip a coin 10 times
and getting 5 heads in a row means the coin is broken.

I did not get your numbers.  Why not do a test with N
random keys?  If about N/2 go to the 2 servers then it
is OK.  No matter how many happen "in a row".

If your app does 1000 key lookups while 1 of 2 hosts are
down, you would expect about 500 to fail (given the current
memcached availability model).  What does it matter if
500 in a row fail, or every other one fails?

With any hashing algorithm, it is always possible your
application uses keys that cause the hash to be unfair.

Brian Beuning

Quote of the day
Knuth Volume 2, page 5, "The moral of this story is that random numbers
should not be generated with a method chosen at random.  Some theory
should be used."

-----Original Message-----
From: Robert Szefler
To: memcached at lists.danga.com
Cc: development.lists.sensisoft.com
Sent: 10/25/2007 5:47 AM
Subject: python-memcached server selection code (long)

Hi,

we've been testing the python-memcached module along with developing our

(large scale) application and a subtle problem surfaced. Let me explain.

The function that selects the server for a key is Client._get_server. It

performs a kind of "random walk" to select indices of servers stored in 
a table. In the current (1.40) version, this function is essentially 
iterating a formula based on conversion to decimal representation, 
concatenation with a counter in decimal, and CRC32 (which I guess was in

fact supposed to act as a PRNG). What I'm afraid of, is the situation 
when the same server is selected many times in a row. I encountered that

situation for a real-life key in our application and, wondering, how 
often such a situation might happen, written a simple program to test 
it. I've tested the server selection code for keys generated as 
consecutive integers written in decimal ("1", "2", ...) and for 
selection between two servers. Here go the results:

# trials : m : dens
5 : 7 : ~16
7 : 116 : ~64
10 : 328 : ~520 (512?)
13 : 1375 : ~4170 (4096?)
16 : 10617 : ~33313 (32768?)
18 : 274622 : ~149126 (131072?)

What the columns mean:

# trials - the number of rounds the server selection was run 
(corresponds to Client._SERVER_RETRIES)
m - the first (smallest) value of a key that causes all the rounds of 
server selection to return identical value (either all 0's or all 1's)
dens - (experimental) inverse density of such keys (that give several 
identical server selections), generated by testing the code for several 
million key values.

 From what can be seen here, the density seems to follow the expected 
2^(-#trials+1) formula (the +1 is here because two results are 
considered "bad": [0,0,...0] and [1,1,...1]). So the selection function 
seems to behave genuinely randomly, as I think was expected. Which is, 
ultimately, its doom.

The problem is that even with quite many trials (for example, the 
default 10) the code is quite probable to generate a sequence of 
identical selections. In fact, for the default 10 rounds, on average one

in 512 keys will cause always the same server to be selected (assuming 
there are two of them). This would be especially bad when one of two 
memcached servers goes down - then, every 1 out of 1024 keys (on 
average) would *never* be cached (read/written) because the client would

be trying hard to put it on a defunct server all the time.

Designing a good sequence generator to fix this problem is marginally 
non-trivial. The only sensible way I imagine to ensure both a certain 
degree of randomness and guarantee that every server will eventually be 
tested (assuming that the number of servers is <= number of trials) is 
to use random permutations. For this purpose, for example the 
Fisher-Yates algorithm might be used and I think this is the way to go 
here. A problem appears when the random permutation of length, say, 3 is

exhausted, because we have, say, 5 trials to do. There are two 
possibilities: generate another permutation or reuse the last one. Both 
approaches have their merits which I will not discuss because this mail 
is already getting longish. I personally would recommend reusing the 
same permutation. This version I have coded and included here as a
patch.

Please feel free to comment on this issue; I think it is quite serious 
and python-memcached should be patched. What is the situation in other 
client libraries?

-- 
Robert Szefler
Sensisoft

 <<memcache.patch>>