memcached hiccups?
Just Marc
marc at corky.net
Wed Apr 11 18:26:35 UTC 2007
You're most likely running out of the kernel memory allocated to TCP.
Look at the limits under /proc/sys/net/ipv4/tcp_*mem* (tcp_mem, tcp_rmem
and tcp_wmem) and raise them if they are being hit.
I'd also suggest you run netstat -s and look at the TCP section,
especially the failure counters.
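
If it helps, here is a rough sketch of how you could check whether the box
is near its TCP memory limits. It assumes a Linux /proc layout where
/proc/sys/net/ipv4/tcp_mem holds three page counts (low, pressure, high)
and the "TCP:" line of /proc/net/sockstat reports a "mem" field in pages.
The function names are made up for illustration, and this is a starting
point rather than a proper monitoring tool.

```python
import os

TCP_MEM_PATH = "/proc/sys/net/ipv4/tcp_mem"
SOCKSTAT_PATH = "/proc/net/sockstat"


def parse_tcp_mem(text):
    """Parse tcp_mem contents: three page counts (low, pressure, high)."""
    low, pressure, high = (int(x) for x in text.split())
    return {"low": low, "pressure": pressure, "high": high}


def parse_sockstat_tcp(text):
    """Return the name/value pairs from the 'TCP:' line of sockstat."""
    for line in text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()[1:]
            # Fields alternate name, value: inuse 8 orphan 0 ... mem 1
            return {k: int(v) for k, v in zip(fields[0::2], fields[1::2])}
    return {}


def tcp_memory_state(limits, pages_in_use):
    """Classify current TCP page usage against the tcp_mem thresholds."""
    if pages_in_use >= limits["high"]:
        return "over high limit"
    if pages_in_use >= limits["pressure"]:
        return "under pressure"
    return "ok"


if __name__ == "__main__":
    # Only meaningful on a Linux host that exposes these /proc files.
    if os.path.exists(TCP_MEM_PATH) and os.path.exists(SOCKSTAT_PATH):
        with open(TCP_MEM_PATH) as f:
            limits = parse_tcp_mem(f.read())
        with open(SOCKSTAT_PATH) as f:
            tcp = parse_sockstat_tcp(f.read())
        print(limits, tcp, tcp_memory_state(limits, tcp.get("mem", 0)))
    else:
        print("TCP /proc files not found on this system")
```

Running it from cron every minute or so and logging the output would show
whether usage creeps toward the limits in the days before a hiccup.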
Enjoy
>
> Managed to tweak the kernel logging settings so that messages are no
> longer suppressed, and got these:
>
> Apr 11 10:02:43 poseidon kernel: TCP: drop open request from x.x.x.x/56575
>
> Certainly seems like a kernel problem, and one we haven't seen before.
> I'm afraid this is out of my depth - why would the kernel be dropping
> TCP open requests?
>
> FYI, this is happening on kernels:
>
> 2.6.9-42.0.3.ELsmp
> 2.6.9-5.ELsmp
>
>
> Are there kernel tuneables I should be looking at?
>
> Thanks,
>
> Don
>
>
> Don MacAskill wrote:
>>
>> That's what I've suspected, too, so I've been digging, but haven't
>> found anything that looks suspicious. :(
>>
>> Don
>>
>>
>> marc at corky.net wrote:
>>> I'm only commenting re the kernel message. You should dig up the
>>> original message in your /var/log log files.... It's more likely a
>>> kernel config issue rather than a memcached issue. I'm guessing
>>> it's some sort of out of socket memory condition. Check it out and
>>> let us know.
>>>
>>> Marc
>>>
>>>
>>>>
>>>> I've got something strange going on, and can't seem to figure it
>>>> out. One of our memcached boxes will periodically choke on memcached.
>>>>
>>>> Periodically usually means ~6-8 days or so, but there's some
>>>> variation.
>>>>
>>>> Choke usually means it fails sock_to_host from lots of clients, so
>>>> we take it out of rotation and then check it later and put it back in.
>>>>
>>>> It's memcached-1.2.1 from the rpm in dag's repository, using
>>>> libevent 1.3b from that same repo.
>>>>
>>>> I'm running 4 instances on this box (4 Opteron cores, haven't had
>>>> the guts to try the multithreaded one yet), all of which fail
>>>> simultaneously, and one of the four instances says: "Failed to
>>>> write, and not due to blocking: Connection reset by peer". The
>>>> others don't say anything (only -v).
>>>>
>>>> Finally, the syslog fills up with lots and lots of these:
>>>>
>>>> "Mar 26 09:50:20 poseidon kernel: printk: 13382 messages suppressed."
>>>>
>>>> But there's no indication of what message it was that was suppressed.
>>>>
>>>> There's nothing else running on the box, there's no swap, it's not
>>>> running out of RAM, etc.
>>>>
>>>> Has anyone else seen this or anything like it?
>>>>
>>>> Thanks,
>>>>
>>>> Don
>>>>
>>>>
>>>
>>>
>>
>