Memcache connection errors

Steven Grimm sgrimm at facebook.com
Wed May 24 13:44:00 UTC 2006


I can tell you that when we've seen failures to connect to memcached, 
the problem has always turned out to be a network issue -- most often a 
flaky switch or loose cable. But in this case it sounds like a simple 
capacity problem.

How much data do you run through memcached? You say you moved from a 
gigabit network to a 100Mbps network; that's an order of magnitude less 
available bandwidth for memcached traffic and you may simply be 
saturating the network. You could be saturating your switch, rather than 
maxing out any one machine's capacity; that is even more likely if 
you're sharing a switch with other customers at your hosting provider, 
who might have bursts in traffic that overwhelm the switch without you 
knowing about it.

Look at your memcached stats and see how fast the "bytes_read" and 
"bytes_written" numbers change. If you're pushing a few tens of megabits a 
second, then periodic spikes in traffic could easily overwhelm your network.
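
As a rough sketch of that check (not a polished tool): memcached's "stats" output reports the byte counters as bytes_read and bytes_written, so sampling them twice and dividing the difference by the interval gives an average rate. The host, port, 10-second interval, and function names below are placeholders I made up for illustration, and the 100 Mbit/s link speed is just the figure from your description.

```python
import socket
import time

LINK_MBPS = 100.0  # assumed link speed: the 100Mbps network described above


def fetch_stats(host="127.0.0.1", port=11211, timeout=2.0):
    """Send the text-protocol "stats" command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        # The stats reply is terminated by a line containing only "END".
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace")


def parse_stats(raw):
    """Turn "STAT name value" lines into a dict of name -> value strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2].strip()
    return stats


def megabits_per_second(before, after, seconds):
    """Combined read + write rate between two stats samples, in Mbit/s."""
    delta_bytes = sum(
        int(after[key]) - int(before[key])
        for key in ("bytes_read", "bytes_written")
    )
    return delta_bytes * 8 / seconds / 1_000_000


def sample(host="127.0.0.1", port=11211, interval=10.0):
    """Take two samples `interval` seconds apart and return the average rate."""
    before = parse_stats(fetch_stats(host, port))
    time.sleep(interval)
    after = parse_stats(fetch_stats(host, port))
    return megabits_per_second(before, after, interval)
```

Calling sample() against each memcached instance and comparing the result with LINK_MBPS shows how much of the link the average load uses. Keep in mind this is an average over the interval, so short bursts can be much higher than the number it reports.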

In any event, if you're seeing 7ms ping times on your local switched 
network, something is definitely wrong there. You should never see that 
on a network that's not pushing up against its capacity limits.

-Steve


Kamran Nisar wrote:
>
> got zero tx/rx errors...
>
>
> even the ping returned 0% packet loss, though there were some random 
> sharp jumps in response time


