Dead connections aren't removed from the pool in the Java memcached client?

Mon Aug 6 17:08:45 UTC 2007

Hey memcached folks,

We've encountered an odd production issue with the Java memcached
client. This is occurring in a memcached cluster of 30 nodes running
server version 1.2.1, with Java client version 1.5.1. I realize
neither of these is the latest and greatest version of the
respective software, but so far I see nothing in the change logs
to indicate that our problem would be solved by upgrading.

The issue is that when one or more server nodes go down and then  
later come back up, dead connections seem to persist in the client  
connection pools for an indefinite period.  We are aware of this  
because we can see the memcached client SockIOPool class logging the  
following error on some percentage of incoming requests (where "foo"  
is just the output of the socket's toString method and "bar" is the  
host).

++++ socket in avail pool is not connected: foo for host: bar

The errors seem to decrease very slowly over time, but a lot
of them persist after 24 hours, and the only remedy is to restart the
application JVMs running the memcached client. Note that while this
error is occurring, the memcached node that died and then restarted
seems to be getting a relatively normal volume of traffic.
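
To make the expectation concrete, here is a minimal sketch of the behavior we had assumed: when a socket taken from the available pool turns out not to be connected, it is closed and dropped rather than left in the pool, and a fresh connection is opened if nothing usable remains. This is not SockIOPool's code. The names `PooledConn`, `EvictingPool`, and `FakeConn` are hypothetical and exist only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled socket; NOT the client's own socket class.
interface PooledConn {
    boolean isConnected();
    void close();
}

// Hypothetical per-host pool that discards dead connections at checkout.
class EvictingPool {
    interface Factory {
        PooledConn open();
    }

    private final Deque<PooledConn> avail = new ArrayDeque<>();
    private final Factory factory;
    private int evicted = 0;

    EvictingPool(Factory factory) {
        this.factory = factory;
    }

    // Return a connection to the available pool.
    synchronized void giveBack(PooledConn conn) {
        avail.addLast(conn);
    }

    // Hand out the first live connection; close and drop any dead ones
    // found along the way, and open a new one if none are left.
    synchronized PooledConn checkOut() {
        PooledConn conn;
        while ((conn = avail.pollFirst()) != null) {
            if (conn.isConnected()) {
                return conn;
            }
            conn.close();
            evicted++;
        }
        return factory.open();
    }

    synchronized int availableCount() {
        return avail.size();
    }

    synchronized int evictedCount() {
        return evicted;
    }
}

// In-memory fake so the sketch runs without a memcached server.
class FakeConn implements PooledConn {
    private boolean connected;

    FakeConn(boolean connected) {
        this.connected = connected;
    }

    public boolean isConnected() {
        return connected;
    }

    public void close() {
        connected = false;
    }
}

class PoolSketch {
    public static void main(String[] args) {
        EvictingPool pool = new EvictingPool(() -> new FakeConn(true));
        // Simulate three sockets left over from before the node restarted.
        for (int i = 0; i < 3; i++) {
            pool.giveBack(new FakeConn(false));
        }
        PooledConn conn = pool.checkOut();
        System.out.println("connected=" + conn.isConnected()
                + " evicted=" + pool.evictedCount()
                + " avail=" + pool.availableCount());
        // prints "connected=true evicted=3 avail=0"
    }
}
```

With a pool behaving like this, we would expect the number of "not connected" errors per host to be bounded by the number of stale sockets in the pool at the time of the restart, rather than persisting for a day or more.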

These are the settings we currently use on SockIOPool:

pool.setMinConn(5);
pool.setMaxConn(50);
pool.setInitConn(5);
pool.setSocketTO(1000);
pool.setSocketConnectTO(100);
pool.setFailover(false);
pool.setFailback(true);
pool.setAliveCheck(false);

Any help would be greatly appreciated.

Eli Bingham
Senior Engineer
Pandora Media, Inc.