Intermittent memcached pauses
Chris Hondl
chris at imvu.com
Mon Dec 4 07:09:44 UTC 2006
I eventually took the time to track our bug down using iostat at Brad's
suggestion. We had an application bug that occasionally generated a very
large multi-get request (10,000 keys). I discovered the bug when I noticed
the realloc in try_read_network was increasing the buffer to 1MB. The
server got tied up processing the request, which caused various problems as
other machines in the cluster timed it out and marked it as failed.
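
For a sense of scale, here is the back-of-the-envelope math (the average key
length and the buffer-doubling behavior are my assumptions, not measurements):

    # Rough illustration only: why a 10,000-key get can push the read
    # buffer toward 1MB.  Average key length here is a guess.
    keys = 10000
    avg_key_bytes = 64                        # assumed average key length
    request_bytes = len('get ') + keys * (avg_key_bytes + 1) + len('\r\n')
    print(request_bytes)                      # roughly 650,000 bytes of command text

    buf = 2048                                # assuming the buffer starts small
    while buf < request_bytes:                # and grows by doubling
        buf *= 2
    print(buf)                                # 1048576, i.e. 1MB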
A nice-to-have feature would be a limit on the number of keys memcached
will attempt to process in a single multi-get request.
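
In the meantime we are capping multi-get sizes on the application side. A
minimal sketch of the idea, assuming the python-memcached client (the batch
size, helper name, and host are arbitrary, not anything built into memcached):

    # Illustrative workaround: split one huge get_multi into bounded batches
    # so a single request can't tie up a memcached server for seconds.
    import memcache

    MAX_KEYS_PER_GET = 256                    # arbitrary cap, tune to taste

    def batched_get_multi(mc, keys, batch_size=MAX_KEYS_PER_GET):
        """Fetch keys in chunks instead of one giant multi-get."""
        results = {}
        keys = list(keys)
        for i in range(0, len(keys), batch_size):
            results.update(mc.get_multi(keys[i:i + batch_size]))
        return results

    mc = memcache.Client(['10.0.0.1:11211'])  # hypothetical host
    flags = batched_get_multi(mc, ['flag:%d' % n for n in range(10000)])

Chunking like this keeps each request's command line (and therefore the
server-side read buffer) small, at the cost of a few extra round trips.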
Chris
On 11/7/06, Brett G. Durrett <brett at imvu.com> wrote:
>
>
> I don't believe we are using either "stats cachedump" or "stats sizes".
> While trying to isolate the problem I disabled all monitoring scripts,
> which only use the "stats" command. It is possible (but highly
> unlikely) that a diagnostic tool was running from another host, but the
> timing of these failures suggests it is something other than a monitoring
> script.
>
> Oddly enough, the behavior seemed to go away when I opened a telnet
> session to the memcached port and issued the 'stats' command... I assume
> this was just a coincidence.
>
> B-
>
> Steven Grimm wrote:
>
> > Are you by any chance using the "stats cachedump" or "stats sizes"
> > commands? Both of those will pause the server while they run; they are
> > intended for occasional debugging use rather than regular production use.
> >
> > -Steve
> >
> >
> > Chris Hondl wrote:
> >
> >>
> >> We've had memcached deployed at IMVU for about 9 months now. It
> >> works great, scales easily, never goes down, and totally saved us as
> >> traffic on our site went up and up.
> >>
> >> We are now seeing a problem where one of our memcached hosts
> >> periodically stops handling requests for several seconds. This
> >> particular host sees about 30k get requests per second and a small
> >> number of sets; all stored values are small (booleans), and 90+% of
> >> the gets are cache misses.
> >>
> >> Watching the machine, it looks like it is actually memcached that is
> >> experiencing the delays and using all of the CPU. We tried disabling
> >> all non-critical resources and still saw the pauses. When the CPU does
> >> get used up for 3-5 seconds, the number of processes waiting for CPU
> >> time typically goes to 2 or 3, and in top the memcached process's CPU
> >> utilization jumps from ~30% to 99+%.
> >>
> >> Has anyone else seen behavior like this?
> >>
> >> Chris
> >>
> >>
> >>
> >> Normal:
> >> top - 13:00:00 up 123 days, 14:02, 4 users, load average: 0.36, 0.52, 0.43
> >> Tasks: 51 total, 3 running, 48 sleeping, 0 stopped, 0 zombie
> >> Cpu(s): 22.3% us, 5.0% sy, 0.0% ni, 68.8% id, 0.0% wa, 0.0% hi, 4.0% si
> >> Mem: 3969768k total, 260392k used, 3709376k free, 22176k buffers
> >> Swap: 6144852k total, 0k used, 6144852k free, 60380k cached
> >>
> >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> >> * 6465 root 16 0 108m 81m 1380 R 28.0 2.1 7:25.40 memcached*
> >> 6829 root 16 0 1876 928 1672 R 0.7 0.0 0:00.11 top
> >> 1 root 16 0 1504 512 1352 S 0.0 0.0 0:13.55 init
> >>
> >> Problem:
> >>
> >> top - 13:00:42 up 123 days, 14:03, 4 users, load average: 0.31, 0.49, 0.42
> >> Tasks: 52 total, 2 running, 50 sleeping, 0 stopped, 0 zombie
> >> Cpu(s): 98.3% us, 0.3% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 1.3% si
> >> Mem: 3969768k total, 265136k used, 3704632k free, 22176k buffers
> >> Swap: 6144852k total, 0k used, 6144852k free, 60388k cached
> >>
> >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> >> * 6465 root 20 0 110m 82m 1380 R 99.6 2.1 7:40.27 memcached*
> >> 1 root 16 0 1504 512 1352 S 0.0 0.0 0:13.55 init
> >> 2 root 34 19 0 0 0 S 0.0 0.0 0:00.04 ksoftirqd/0
> >>
> >>
> >>
> >>
> >> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> >> r b swpd free buff cache si so bi bo in cs us sy id wa
> >> 2 0 0 3713424 22176 60332 0 0 0 0 0 0 22 9 69 0
> >> 1 0 0 3713296 22176 60332 0 0 0 0 0 0 23 9 68 0
> >> 2 0 0 3713360 22176 60332 0 0 0 0 0 0 22 10 68 0
> >> 1 0 0 3713312 22176 60332 0 0 0 0 0 0 24 10 66 0
> >> 1 0 0 3713312 22176 60332 0 0 0 0 0 0 22 12 66 0
> >> 1 0 0 3713320 22176 60332 0 0 0 0 0 0 23 11 66 0
> >> 1 0 0 3713320 22176 60332 0 0 0 25 0 0 23 8 69 0
> >> 2 0 0 3713336 22176 60332 0 0 0 17 0 0 22 10 68 0
> >> 1 0 0 3713400 22176 60332 0 0 0 0 0 0 25 10 65 0
> >> 1 0 0 3713400 22176 60336 0 0 0 0 0 0 24 10 66 0
> >> 1 0 0 3713400 22176 60336 0 0 0 0 0 0 22 8 70 0
> >> 1 0 0 3713336 22176 60336 0 0 0 0 0 0 24 8 68 0
> >> 1 0 0 3713272 22176 60336 0 0 0 24 0 0 22 11 67 0
> >> 1 0 0 3713272 22176 60336 0 0 0 0 0 0 24 10 66 0
> >> 1 0 0 3713208 22176 60336 0 0 0 0 0 0 23 9 68 0
> >> 1 0 0 3713208 22176 60336 0 0 0 0 0 0 20 10 70 0
> >> * 2 0 0 3711032 22176 60336 0 0 0 0 0 0 73 2 25 0
> >> 2 0 0 3705720 22176 60340 0 0 0 132 0 0 98 2 0 0
> >> 2 0 0 3703544 22176 60344 0 0 0 0 0 0 99 1 0 0
> >> 2 0 0 3706488 22176 60344 0 0 0 0 0 0 97 3 0 0
> >> 2 0 0 3711416 22176 60348 0 0 0 0 0 0 98 2 0 0
> >> 2 0 0 3709624 22176 60348 0 0 0 0 0 0 99 1 0 0
> >> 2 0 0 3708848 22176 60348 0 0 0 72 0 0 98 2 0 0
> >> 2 0 0 3712048 22176 60348 0 0 0 0 0 0 87 3 10 0*
> >> 1 0 0 3712048 22176 60348 0 0 0 0 0 0 16 2 82 0
> >> 1 0 0 3712048 22176 60348 0 0 0 0 0 0 15 1 84 0
> >> 1 0 0 3712048 22176 60348 0 0 0 0 0 0 15 3 82 0
> >> 1 0 0 3712048 22176 60348 0 0 0 54 0 0 15 2 83 0
> >> 1 0 0 3712048 22176 60348 0 0 0 0 0 0 15 2 83 0
> >> 1 0 0 3712056 22176 60348 0 0 0 0 0 0 15 2 83 0
> >> 1 0 0 3712056 22176 60348 0 0 0 0 0 0 17 2 81 0
> >> 1 0 0 3712056 22176 60348 0 0 0 0 0 0 14 3 83 0
> >> 1 0 0 3712056 22176 60348 0 0 0 12 0 0 16 2 82 0
> >>
> >>
> >>
> >> --
> >> http://avatars.imvu.com/chris
> >
> >
>
--
http://avatars.imvu.com/chris