Perlbal and Last.fm

Mark Smith marksmith at danga.com
Thu Jan 27 22:13:03 PST 2005


Replies inline!

On Fri, Jan 28, 2005, Russ Garrett wrote:
> Hi, a week or two on, here are a couple of issues we've found with
> Perlbal (we are using the CVS, I lied when I said the tarball - I was
> thinking of memcached...):
> 
> Firstly, we seem to be getting a significant degree of memory leakage
> with perlbal - before I restarted today it was using 1.3GB of memory,
> which is never a good thing. It's running ulimited now, so we shall see
> if this improves things.

Interesting!  We don't have (much) memory leakage with Perlbal.  Our
instances have been up for about a day or two (since the last restart)
and are using only 65MB of memory each, while handling 500-600 requests
per second each.

I'd be interested in knowing more about your usage pattern--what kind of
traffic are you sending at it, and what are you having it do?  (The most
informative would be to see your config file.)

I'd be happy to look into it and try to reproduce and fix!

> Secondly, and I don't know if this is related, we're getting some errors
> from perlbal:
> 
> epoll() returned fd 77 w/ state 1 for which we have no mapping.
> removing.
> epoll() returned fd 30 w/ state 1 for which we have no mapping.
> removing.
> Use of uninitialized value in split
> at /usr/local/share/perl/5.8.4/Net/Netmask.pm line 242.
> Use of uninitialized value in bitwise and (&)
> at /usr/local/share/perl/5.8.4/Net/Netmask.pm line 404.
> 
> As you can see we're using Perl 5.8.4 (the current version in Debian
> Sarge).

We get both of those as well.  The uninitialized value warnings are
caused by peer_ip_string not working, which I think happens when clients
go away after we've started using them.  I just haven't gotten around to
sprinkling in more "return if closed" checks.  I don't think these
really harm anything.

The no mapping errors... we also get those.  :)  But I don't really know
what's causing it.  I think it has something to do with people hanging
up before they actually request something.  (Technically, it seems to
happen if the client does a SYN and then a RST without there being a
real connection in between.)

> Lastly, and I think this is probably specific to our setup and the fact
> that PHP is shit, we sometimes get complete dropouts where the CPU usage
> on the web heads drops to almost 0, and perlbal starts queueing
> everything while it connects some more backends. Maybe our setup needs
> more tweaking

When did this start happening?  There was a bug in Perlbal for about 48
hours over the past two or three days (it was fixed today) that caused
it to mishandle backend connections, sometimes causing long queues
without there being any real traffic...

This was causing quite a mess on our servers, and was totally my fault.
:(  But it's fixed now!

If that's not it, or your CVS checkout is at least a week old (and so
predates that bug), it'd be useful to see the output from the following
management commands the next time this problem happens:

queues
states
show service [your proxy service name]
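
A minimal sketch of collecting that output from a script rather than by
hand.  It assumes the management service listens on a TCP port and ends
each reply with a line containing only "." (both are assumptions, so
check them against your own setup).  The function name is hypothetical,
and Python is used only to keep the sketch short:

```python
import socket


def run_management_commands(host, port, commands, timeout=5.0):
    """Send each command to a text management port and collect the replies.

    Assumption: every reply is terminated by a line containing only ".".
    Returns a dict mapping each command to its list of reply lines.
    """
    replies = {}
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # A binary file wrapper gives simple line-oriented reads and writes.
        with sock.makefile("rwb") as stream:
            for command in commands:
                stream.write(command.encode("ascii") + b"\r\n")
                stream.flush()
                lines = []
                for raw in stream:
                    line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
                    if line == ".":  # assumed end-of-reply marker
                        break
                    lines.append(line)
                replies[command] = lines
    return replies
```

Calling run_management_commands("127.0.0.1", 60000, ["queues", "states"])
would gather both reports in one go.  The address here is a placeholder
for wherever your management service actually listens, and the
"show service" command can be added to the list with your own service
name filled in.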

> How is MaxClients set up on LiveJournal? I initially tried it down as
> low as 10, but it seemed to be having trouble saturating the servers at
> that level (probably our PHP issues again). It's now up to 60, and that
> seems a lot better, however that's increasing the amount of concurrency
> on our DB servers, which isn't necessarily a good thing...

MaxClients for Apache?  I'm not sure on this one... Brad?

--
Mark Smith
junior at danga.com

