python-memcached server selection code (long)

Thu Oct 25 09:47:14 UTC 2007

Hi,

we've been testing the python-memcached module along with developing our 
(large scale) application and a subtle problem surfaced. Let me explain. 
The function that selects the server for a key is Client._get_server. It 
performs a kind of "random walk" to select indices of servers stored in 
a table. In the current (1.40) version, this function is essentially 
iterating a formula based on conversion to decimal representation, 
concatenation with a counter in decimal, and CRC32 (which I guess was in 
fact supposed to act as a PRNG). What I'm afraid of, is the situation 
when the same server is selected many times in a row. I encountered that 
situation for a real-life key in our application and, wondering, how 
often such a situation might happen, written a simple program to test 
it. I've tested the server selection code for keys generated as 
consecutive integers written in decimal ("1", "2", ...) and for 
selection between two servers. Here go the results:

# trials : m : dens
5 : 7 : ~16
7 : 116 : ~64
10 : 328 : ~520 (512?)
13 : 1375 : ~4170 (4096?)
16 : 10617 : ~33313 (32768?)
18 : 274622 : ~149126 (131072?)

What the columns mean:

# trials - the number of rounds the server selection was run 
(corresponds to Client._SERVER_RETRIES)
m - the first (smallest) value of a key that causes all the rounds of 
server selection to return identical value (either all 0's or all 1's)
dens - (experimental) inverse density of such keys (that give several 
identical server selections), generated by testing the code for several 
million key values.

 From what can be seen here, the density seems to follow the expected 
2^(-#trials+1) formula (the +1 is here because two results are 
considered "bad": [0,0,...0] and [1,1,...1]). So the selection function 
seems to behave genuinely randomly, as I think was expected. Which is, 
ultimately, its doom.

The problem is that even with quite many trials (for example, the 
default 10) the code is quite probable to generate a sequence of 
identical selections. In fact, for the default 10 rounds, on average one 
in 512 keys will cause always the same server to be selected (assuming 
there are two of them). This would be especially bad when one of two 
memcached servers goes down - then, every 1 out of 1024 keys (on 
average) would *never* be cached (read/written) because the client would 
be trying hard to put it on a defunct server all the time.

Designing a good sequence generator to fix this problem is marginally 
non-trivial. The only sensible way I imagine to ensure both a certain 
degree of randomness and guarantee that every server will eventually be 
tested (assuming that the number of servers is <= number of trials) is 
to use random permutations. For this purpose, for example the 
Fisher-Yates algorithm might be used and I think this is the way to go 
here. A problem appears when the random permutation of length, say, 3 is 
exhausted, because we have, say, 5 trials to do. There are two 
possibilities: generate another permutation or reuse the last one. Both 
approaches have their merits which I will not discuss because this mail 
is already getting longish. I personally would recommend reusing the 
same permutation. This version I have coded and included here as a patch.

Please feel free to comment on this issue; I think it is quite serious 
and python-memcached should be patched. What is the situation in other 
client libraries?

-- 
Robert Szefler
Sensisoft

-------------- next part --------------
A non-text attachment was scrubbed...
Name: memcache.patch
Type: text/x-patch
Size: 1366 bytes
Desc: not available
Url : http://lists.danga.com/pipermail/memcached/attachments/20071025/88b95c18/memcache-0001.bin