From brad at danga.com Sat Nov 5 00:06:03 2005
From: brad at danga.com (Brad Fitzpatrick)
Date: Sat Nov 5 00:06:05 2005
Subject: Upload tracking with Perlbal
Message-ID:

As of the latest version or two, Perlbal can help you show a progress bar
to users during uploads.

Here's how you configure your Perlbal:

    # Make a service that listens for UDP progress packets. With a single
    # Perlbal, these packets are only from itself. Sorry, I didn't optimize
    # for the single host case.
    CREATE SERVICE uptrack
      SET role = upload_tracker
      SET listen = 127.0.0.1:7002
    ENABLE uptrack

    # Then configure your web_proxy service to send those packets once a
    # second per connection (if tracking was actually requested for the
    # upload)
    CREATE SERVICE fotobilder
      SET role = reverse_proxy
      ....
      SET upload_status_listeners = 127.0.0.1:7002, 10.54.0.1:7001
      ...

Then use this JavaScript library:

    http://cvs.danga.com/browse.cgi/wcmtools/js/perlbal-uploadtrack.js?rev=1.2

And on your page, use JavaScript to intercept the form submit event, change
the form's target to an in-page iframe, and kick off the library doing its
tracking (using XMLHttpRequest), calling your callback. (If the client
doesn't have JavaScript, your form submit goes through unchanged...)

Here's an example:

Here's the patch I just did for fotobilder with almost that exact same code
above:

    http://www.livejournal.com/community/fotobilder_cvs/257401.html

Enjoy!

- Brad

From aar at cpan.org Mon Nov 7 03:50:12 2005
From: aar at cpan.org (Alessandro Ranellucci)
Date: Mon Nov 7 03:50:17 2005
Subject: SSL handshake blocks Perlbal
Message-ID:

Greetings,

an easy way to block an instance of Perlbal running SSL is to open a
telnet session to its port. The accept() method will then wait for the
client handshake, thus blocking the whole application.

See this thread for further information:
http://www.cpanforum.com/threads/433

We would need a non-blocking IO::Socket::SSL port. :-(
Brad, any clue?

- alessandro ranellucci.
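
The blocking described above comes from running the whole TLS handshake
inside accept(). What follows is a minimal sketch of the alternative, written
in Python rather than Perl and not using IO::Socket::SSL: wrap the accepted
socket without handshaking, switch it to non-blocking mode, and advance the
handshake one step at a time from the event loop. The names handshake_step
and silent_client_demo are illustrative only, and the socketpair stands in
for the telnet client that connects and never sends anything.

```python
import socket
import ssl


def handshake_step(tls_sock):
    """Advance a TLS handshake by one step without blocking.

    Returns "done", "want_read", or "want_write" so an event loop can
    register the socket for the matching readiness event and retry later.
    """
    try:
        tls_sock.do_handshake()
    except ssl.SSLWantReadError:
        return "want_read"
    except ssl.SSLWantWriteError:
        return "want_write"
    return "done"


def silent_client_demo():
    """Simulate a telnet-style client that connects and never speaks."""
    server_end, client_end = socket.socketpair()
    # Assumption for the demo: no certificate is loaded. The handshake could
    # never complete this way, but it does not need to, because the client
    # never sends a ClientHello in the first place.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    tls = ctx.wrap_socket(server_end, server_side=True,
                          do_handshake_on_connect=False)
    tls.setblocking(False)
    try:
        return handshake_step(tls)
    finally:
        tls.close()
        client_end.close()
```

With a silent peer the step should report that it is waiting for input
instead of hanging. An event loop would then watch the socket for
readability and call handshake_step again when data arrives, so one idle
client does not stall every other connection.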
From brad at danga.com Mon Nov 7 10:23:39 2005
From: brad at danga.com (Brad Fitzpatrick)
Date: Mon Nov 7 10:23:42 2005
Subject: SSL handshake blocks Perlbal
In-Reply-To:
References:
Message-ID:

I feared something like this might happen.

I have no answer except begging the IO::Socket::SSL maintainer to look at
Perlbal and try to make IO::Socket::SSL non-blocking in that phase. :/

- Brad

On Mon, 7 Nov 2005, Alessandro Ranellucci wrote:

> Greetings,
>
> an easy way to block an instance of Perlbal running SSL is to open a
> telnet session to its port. The accept() method will then wait for
> client handshake, thus blocking the whole application.
>
> See this thread for further information:
> http://www.cpanforum.com/threads/433
>
> We would need a non-blocking IO::Socket::SSL port. :-(
> Brad, any clue?
>
> - alessandro ranellucci.
>
>

From fred at redhotpenguin.com Mon Nov 7 13:26:06 2005
From: fred at redhotpenguin.com (Fred Moyer)
Date: Mon Nov 7 13:51:18 2005
Subject: socket never became readable
Message-ID:

Hi,

I've been encountering the following error message in my application
server logs on an infrequent basis:

    MogileFS::Backend: socket never became readable at
    /home/app_user/perl/lib/site_perl/5.8.6/i686-linux/MogileFS.pm line 167

Perlbal and MogileFS daemons are running on a front end proxy server, and
this message occurs occasionally in the application server logs during
re-proxy handling. On the client-facing server there are 4 perlbal daemons,
4 mogstored daemons running at idle, and about 2 dozen mogilefsd daemons.

Any clues on the next step in debugging this problem are appreciated.
Chances are I'm missing something obvious here, as I am a relative newcomer
to re-proxying in production.

Thanks,

- Fred

From scrazy77 at gmail.com Thu Nov 10 19:21:44 2005
From: scrazy77 at gmail.com (Eric Chang)
Date: Thu Nov 10 19:21:47 2005
Subject: Perlbal-XS-HTTPHeaders compile error
Message-ID:

I'm using Gentoo Linux x86-64, and "make test" for Perlbal-XS-HTTPHeaders
does not pass:
--start -- Tiffany Perlbal-XS-HTTPHeaders # perl Makefile.PL Note (probably harmless): No library found for -lstdc++ Writing Makefile for Perlbal::XS::HTTPHeaders Tiffany Perlbal-XS-HTTPHeaders # make g++ -c -I. -g -O2 -march=k8 -pipe -DVERSION=\"0.18\" -DXS_VERSION=\"0.18\" -fPIC "-I/usr/lib/perl5/5.8.6/x86_64-linux/CORE" headers.cpp g++ -c -I. -g -O2 -march=k8 -pipe -DVERSION=\"0.18\" -DXS_VERSION=\"0.18\" -fPIC "-I/usr/lib/perl5/5.8.6/x86_64-linux/CORE" HTTPHeaders.c Running Mkbootstrap for Perlbal::XS::HTTPHeaders () chmod 644 HTTPHeaders.bs rm -f blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.so x86_64-pc-linux-gnu-gcc -shared -L/usr/local/lib headers.o HTTPHeaders.o -o blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.so \ \ chmod 755 blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.so cp HTTPHeaders.bs blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.bs chmod 644 blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.bs Manifying blib/man3/Perlbal::XS::HTTPHeaders.3pm Tiffany Perlbal-XS-HTTPHeaders # make test PERL_DL_NONLAZY=1 /usr/bin/perl5.8.6 "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/HTTPHeaders....NOK 1# Failed test (t/HTTPHeaders.t at line 10) # Tried to use 'Perlbal::XS::HTTPHeaders'. # Error: Can't load '/home/wcmtools/lib/Perlbal-XS-HTTPHeaders/blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.so' for module Perlbal::XS::HTTPHeaders: /home/wcmtools/lib/Perlbal-XS-HTTPHeaders/blib/arch/auto/Perlbal/XS/HTTPHeaders/HTTPHeaders.so: undefined symbol: __gxx_personality_v0 at /usr/lib/perl5/5.8.6/x86_64-linux/DynaLoader.pm line 230. # at (eval 1) line 2 # Compilation failed in require at (eval 1) line 2. t/HTTPHeaders....ok 3/31&Perlbal::XS::HTTPHeaders::constant not defined at t/HTTPHeaders.t line 45 # Looks like you planned 31 tests but only ran 3. # Looks like your test died just after 3. t/HTTPHeaders....dubious Test returned status 255 (wstat 65280, 0xff00) DIED. 
FAILED tests 1, 4-31 Failed 29/31 tests, 6.45% okay Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/HTTPHeaders.t 255 65280 31 57 183.87% 1 4-31 Failed 1/1 test scripts, 0.00% okay. 29/31 subtests failed, 6.45% okay. make: *** [test_dynamic] Error From dormando at rydia.net Wed Nov 16 12:27:22 2005 From: dormando at rydia.net (dormando) Date: Wed Nov 16 12:27:28 2005 Subject: Limiting backend connections per node Message-ID: <437B962A.3040804@rydia.net> Hey all, I have gotten perlbal up and running without issue, so far. However, now stands the daunting task of testing it out of and in production ;) I see perlbal relies on the backend servers MaxClients (or similar) setting to limit the number of accepting backend connections that each server has. This is an issue for me (at the moment), since 35+ of my backend webservers were set up by hand, use different linux distros, have their configs in different places, etc. I can't rebuild them for a couple more weeks, so it'd be much easier if perlbal had a node option to limit the max number of backend connections it will try to open to each independent node. It would also make it much simpler to try different maximums for the backend to see which is most efficient for throughput. Otherwise (as far as I can tell) I have to change the config and grace apache on 100+ machines, which even when done in parallel is a bit obnoxious. So uhh. Should I try hacking it in, does perlbal not work right with this kind of setup, or what? 
:) Thanks, -Alan From junior at danga.com Wed Nov 16 12:31:16 2005 From: junior at danga.com (Mark Smith) Date: Wed Nov 16 12:31:18 2005 Subject: Limiting backend connections per node In-Reply-To: <437B962A.3040804@rydia.net> References: <437B962A.3040804@rydia.net> Message-ID: <20051116203116.GB22817@danga.com> > I see perlbal relies on the backend servers MaxClients (or similar) > setting to limit the number of accepting backend connections that each > server has. > > This is an issue for me (at the moment), since 35+ of my backend > webservers were set up by hand, use different linux distros, have their > configs in different places, etc. I can't rebuild them for a couple more > weeks, so it'd be much easier if perlbal had a node option to limit the > max number of backend connections it will try to open to each > independent node. It would also make it much simpler to try different > maximums for the backend to see which is most efficient for throughput. > Otherwise (as far as I can tell) I have to change the config and grace > apache on 100+ machines, which even when done in parallel is a bit > obnoxious. I guess I'm a little confused as to where the problem is. Perlbal will connect to a node until it stops processing requests, yes, but this means that Perlbal will only ever have N+1 connections open to each backend, where N is the number of MaxClients. This should be ideal, assuming your MaxClients are set to sane values. If you haven't, though, and they're set really high -- then yes, you're going to lose performance on those nodes when they get above their limit. On LiveJournal, if that happens, we just fix the MaxClients for the node that is underperforming. That may be 'annoying' sure, but it's always worked for us. > So uhh. Should I try hacking it in, does perlbal not work right with > this kind of setup, or what? :) Unless I'm totally missing the boat (possible! hit me if so) then you shouldn't have to do anything. 
N+1 isn't going to hurt your nodes, especially since the +1 request. Oh -- and in case it's not clear: for this behavior to be, you have to enable the backend verify and use OPTION requests options ... I forget the names offhand but they're in the example configs. -- Junior (aka Mark Smith) junior@danga.com Software Engineer Six Apart / Danga Interactive From dormando at rydia.net Wed Nov 16 13:10:49 2005 From: dormando at rydia.net (dormando) Date: Wed Nov 16 13:10:55 2005 Subject: Limiting backend connections per node In-Reply-To: <20051116203116.GB22817@danga.com> References: <437B962A.3040804@rydia.net> <20051116203116.GB22817@danga.com> Message-ID: <437BA059.6080706@rydia.net> No problem, let me clarify; > >I guess I'm a little confused as to where the problem is. Perlbal will >connect to a node until it stops processing requests, yes, but this means >that Perlbal will only ever have N+1 connections open to each backend, >where N is the number of MaxClients. > > > That's exactly how I understand it works. >This should be ideal, assuming your MaxClients are set to sane values. If >you haven't, though, and they're set really high -- then yes, you're going >to lose performance on those nodes when they get above their limit. > > > They're not set to sane values :) I have 50 webservers where httpd.conf might not even be named httpd.conf right now. Adjusting the MaxClients setting across all of my webservers (at the moment) is a huge manual production. I've been here about a month and a half and am still busy cleaning it up. >On LiveJournal, if that happens, we just fix the MaxClients for the node >that is underperforming. That may be 'annoying' sure, but it's always >worked for us. > > > That's totally fine, but changing it for me is nontrivial at the moment. If I have to switch back to the old load balancer (which does no client buffering), I have to up the maxclients back to 60+ across the backend. 
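
The per-node cap asked about in this thread can be sketched as a small
counter kept on the proxy side. This is only an illustration of the idea in
Python: nothing in these messages says Perlbal has such an option, and the
class name NodeLimiter, its methods, and the max_per_node parameter are all
hypothetical.

```python
from collections import defaultdict


class NodeLimiter:
    """Track open backend connections per node and enforce a cap."""

    def __init__(self, max_per_node):
        self.max_per_node = max_per_node
        self.open_conns = defaultdict(int)

    def try_acquire(self, node):
        """Reserve a connection slot on node; return False if it is full."""
        if self.open_conns[node] >= self.max_per_node:
            return False
        self.open_conns[node] += 1
        return True

    def release(self, node):
        """Give back a slot when a backend connection is closed."""
        if self.open_conns[node] > 0:
            self.open_conns[node] -= 1

    def pick_node(self, nodes):
        """Return the first node with a free slot, or None if all are full."""
        for node in nodes:
            if self.try_acquire(node):
                return node
        return None
```

With a cap like this, a request that finds every node full would wait in the
proxy's pending queue rather than pile onto a backend whose MaxClients is
set too high, and the cap could be tuned in one place instead of on each web
server.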
>Oh -- and in case it's not clear: for this behavior to be, you have to
>enable the backend verify and use OPTION requests options ... I forget the
>names offhand but they're in the example configs.
>
>
>

Yup, I have that all set. The only thing I can't set at the moment is
persist_backend, but that will be enabled if I can switch the whole service
over and enable keepalives on the backend (and if perlbal supports our
backend for persistent connections; I noticed chunked encoding
short-circuits that).

Thanks,
-Alan

From dormando at rydia.net Thu Nov 17 15:17:11 2005
From: dormando at rydia.net (dormando)
Date: Thu Nov 17 15:17:17 2005
Subject: Another fun success
Message-ID: <437D0F77.6080401@rydia.net>

Hey again,

Due to a total overload on our backend and given how well testing was
going, we decided to give perlbal a try in production. Several hours of
manual labor later, and it's working pretty incredibly.

I'm working out the finer points of optimization now, but it's up and
working and our site has never loaded this fast before! I got hung up on a
couple of things with the management interface. Tracking how it's running
is tough, but after a learning curve and some source diving I got the hang
of it.

Right now we're working on getting HTTP 1.0 keep-alives working on our
backend. Since it's all PHP, it really wants to use chunked encoding or
single-shot HTTP 1.0 responses :\ The difference in speed perlbal shows
when working without persistent backend connections is huge... 50-70%
performance degradation, and if there are any users in the pending queue,
it's because it's trying to get/verify a backend connection for them.

It looks like this is fixable on our end with another ~30 minutes of
effort, but I wish I was able to hack in HTTP 1.1 support with my time
:) It'd also be nice to more easily track how many keep-alive
connections were reused vs connections closed. Maybe I can follow this
up with some patches, but probably not given time constraints...
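
The keep-alive constraint mentioned above comes down to framing: without
chunked encoding, an HTTP/1.0-style response can only leave the connection
open if it announces Connection: keep-alive and gives an explicit
Content-Length, so the reader knows where the body ends. A rough Python
sketch of that check follows. The function name backend_reusable_http10 is
made up for illustration, and this is not Perlbal's actual logic.

```python
def backend_reusable_http10(headers):
    """Decide whether an HTTP/1.0-style response leaves the connection reusable.

    headers: dict mapping response header names to string values.
    """
    lowered = {k.lower(): v.strip().lower() for k, v in headers.items()}
    # A chunked body has no place in plain HTTP/1.0 keep-alive.
    if "chunked" in lowered.get("transfer-encoding", ""):
        return False
    # The backend must explicitly opt in to keeping the connection open.
    if lowered.get("connection") != "keep-alive":
        return False
    # Without a numeric Content-Length, the body only ends when the
    # connection closes, so the connection cannot be reused.
    return lowered.get("content-length", "").isdigit()
```

A PHP response that is sent chunked, or without a length, fails this check,
which matches the single-shot behaviour described in the message.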
Thanks, -Alan From brad at danga.com Thu Nov 17 15:26:24 2005 From: brad at danga.com (Brad Fitzpatrick) Date: Thu Nov 17 15:26:26 2005 Subject: Another fun success In-Reply-To: <437D0F77.6080401@rydia.net> References: <437D0F77.6080401@rydia.net> Message-ID: Good to hear. BTW, I've been moving towards chunked support in Perlbal over the past few months, although slowly. On Thu, 17 Nov 2005, dormando wrote: > Hey again, > > Due to a total overload on our backend and given how well testing was > going, we decided to give perlbal a try in production. Several hours of > manual labor later, and it's working pretty incredibly. > > I'm working out the finer points of optimization now, but it's up and > working and our site has never loaded this fast before! I had gotten > hitched up on a couple things with the management interface. Tracking > how it's running is tough, but after a learning curve and some source > diving I got the hang of it. > > Right now we're working on getting HTTP 1.0 keep-alives working on our > backend. Since it's all PHP, it really wants to use chunked encoding or > single-shot HTTP 1.0 responses :\ The difference in speed perlbal shows > when working without persistent backend connections is huge... 50-70% > performance degredation, and if there're any users in the pending queue, > it's because it's trying to get/verify a backend connection for them. > > It looks like this is fixable on our end with another ~30 minutes of > effort, but I wish I was able to hack in HTTP 1.1 support with my time > :) It'd also be nice to more easily track how many keep-alive > connections were reused vs connections closed. Maybe I can follow this > up with some patches, but probably not given time constraints... > > Thanks, > -Alan > > From scrazy77 at gmail.com Thu Nov 17 22:07:45 2005 From: scrazy77 at gmail.com (Eric Chang) Date: Thu Nov 17 22:07:48 2005 Subject: strange perlbal problem Message-ID: I've successed install perlbal and mogilefs. 
Our front end is ZXTM (http://www.zeus.com), then -> 2 perlbal nodes, then
-> 2 web app (mod_php) + mogilefs tracker nodes, and 2 mogstored nodes as
storage. Our web app passes a reproxy header back to perlbal so that it
fetches the real file from mogstored.

All works fine, but we have a slow response problem. Sometimes the browser
just keeps loading and hangs; the client has to wait 10 or more seconds,
and after that the page displays correctly.

So:
1. I tested the web app node by connecting directly: no hang and no
   slowness.
2. I changed the ZXTM settings to bypass the perlbal nodes and go to the
   web app nodes directly: no hang and no slowness, either.

I have no idea how to solve this.

My setup:
Gentoo Linux 2.6
perl 5.8.6
Dual Opteron 2g
Perlbal 1.39
Danga-Socket 1.48
Mogilefs (cvs)

---------perlbal.conf--------
LOAD vhosts
xs enable headers
SERVER aio_mode = linux
SERVER aio_threads = 36

# apache pools
CREATE POOL apaches
POOL apaches ADD 10.0.0.31:80
POOL apaches ADD 10.0.0.32:80

# *.blog.foo.com
CREATE SERVICE blog
    SET listen = 0.0.0.0:81
    SET role = reverse_proxy
    SET pool = apaches
    SET persist_client = on
    SET persist_backend = on
    SET verify_backend = on
    SET always_trusted = on
    SET enable_reproxy = on
    SET buffer_uploads = on
    SET buffer_backend_connect = 128k
ENABLE blog

CREATE SERVICE maintains
    SET listen = 0.0.0.0:82
    SET role = web_server
    SET docroot = /home/system
    SET dirindexing = 1
    SET persist_client = on
ENABLE maintains

# *.*
CREATE SERVICE vhost
    SET listen = 0.0.0.0:80
    SET role = selector
    SET plugins = vhosts
    SET persist_client = on
    VHOST *.blog.foo.com = blog
    VHOST moblog.foo.com = blog
ENABLE vhost

# always good to keep an internal management port open:
CREATE SERVICE mgmt
    SET role = management
    SET listen = 127.0.0.1:60000
ENABLE mgmt

From scrazy77 at gmail.com Fri Nov 18 06:12:32 2005
From: scrazy77 at gmail.com (Eric Chang)
Date: Fri Nov 18 06:12:36 2005
Subject: strange perlbal problem
Message-ID:

Sorry, that's mod_php's fault, not perlbal's.
apache+mod_php breaks sometimes...
" exit signal Segmentation fault (11)..." From brad at danga.com Fri Nov 18 09:24:53 2005 From: brad at danga.com (Brad Fitzpatrick) Date: Fri Nov 18 09:24:55 2005 Subject: strange perlbal problem In-Reply-To: References: Message-ID: Phew. Always good to know it's not my fault. :-) On Fri, 18 Nov 2005, Eric Chang wrote: > sorry.. > That's mod_php's fault,not perlbal... > apache+mod_php broken sometimes.. > " exit signal Segmentation fault (11)..." > > From dormando at rydia.net Fri Nov 18 09:36:45 2005 From: dormando at rydia.net (Dormando) Date: Fri Nov 18 09:37:21 2005 Subject: Bizarre perlbal problem Message-ID: <437E112D.60803@rydia.net> (sorry, couldn't resist :P) We got HTTP 1.0 keepalives working (for the most part) late in the day yesterday. Overnight we witnessed a pretty bad glitch where our users would randomly get other user's site cookies and become logged in as someone else. It happened in a small percentage of users, but turning off the backend keepalives seems to have removed the issue. We're still investigating on our end, but I'm having a hard time even speculating how that happened; was perlbal sending back responses from clients other than the requestor? Any insight? Other things that came to mind: - It's possible sometimes we return an improper content-length. - If a client connection closes before reading any data back from its connection, does perlbal always junk the backend request before reusing it? I do have a weird routing setup in order to have two perlbal processes running on the same IP address, but since that's just on the frontend and only for incoming requests, I can't see how the responses would get switched... If they were, it'd happen a lot more often than with what we saw. 
Thanks,
-Alan

From brad at danga.com Fri Nov 18 09:55:00 2005
From: brad at danga.com (Brad Fitzpatrick)
Date: Fri Nov 18 09:55:01 2005
Subject: Bizarre perlbal problem
In-Reply-To: <437E112D.60803@rydia.net>
References: <437E112D.60803@rydia.net>
Message-ID:

Does your backend's session management assume that for a given
connection, the same user will always be on that connection? What backend
do you use, btw?

If so, that's your problem, since Perlbal mixes up requests and
connections. (on purpose... it's a huge optimization)

- Brad

On Fri, 18 Nov 2005, Dormando wrote:

> (sorry, couldn't resist :P)
>
> We got HTTP 1.0 keepalives working (for the most part) late in the day
> yesterday. Overnight we witnessed a pretty bad glitch where our users
> would randomly get other user's site cookies and become logged in as
> someone else.
>
> It happened in a small percentage of users, but turning off the backend
> keepalives seems to have removed the issue. We're still investigating on
> our end, but I'm having a hard time even speculating how that happened;
> was perlbal sending back responses from clients other than the
> requestor? Any insight?
>
> Other things that came to mind:
> - It's possible sometimes we return an improper content-length.
> - If a client connection closes before reading any data back from its
> connection, does perlbal always junk the backend request before reusing it?
>
> I do have a weird routing setup in order to have two perlbal processes
> running on the same IP address, but since that's just on the frontend
> and only for incoming requests, I can't see how the responses would get
> switched... If they were, it'd happen a lot more often than with what we
> saw.
> > Thanks, > -Alan > > From dormando at rydia.net Fri Nov 18 10:00:37 2005 From: dormando at rydia.net (Dormando) Date: Fri Nov 18 10:01:05 2005 Subject: Bizarre perlbal problem In-Reply-To: References: <437E112D.60803@rydia.net> Message-ID: <437E16C5.6000108@rydia.net> I don't think so... Our backend is just direct connections to PHP running behind either apache1 or apache2. As long as the remote user's independent requests go down and come back from the same server it should work. IE; If user A sends a request down, it just needs to get the reply to that request back. If user A sends a request down, but gets the response headers from user B, we'd get what we're seeing now. There's no reliance on the same user hitting the same exact backend connection for multiple requests. Before we put the LB in there was no keepalive at all, and users would hit one of 80 servers at random. -Alan Brad Fitzpatrick wrote: > Does your backend's session management assuming that for a given > connection, the same user will always be on that connection? What backend > do you use, btw? > > If so, that's your problem, since Perlbal mixes up requests and > connections. (on purpose... it's a huge optimization) > > - Brad > > > On Fri, 18 Nov 2005, Dormando wrote: > > >>(sorry, couldn't resist :P) >> >>We got HTTP 1.0 keepalives working (for the most part) late in the day >>yesterday. Overnight we witnessed a pretty bad glitch where our users >>would randomly get other user's site cookies and become logged in as >>someone else. >> >>It happened in a small percentage of users, but turning off the backend >>keepalives seems to have removed the issue. We're still investigating on >>our end, but I'm having a hard time even speculating how that happened; >>was perlbal sending back responses from clients other than the >>requestor? Any insight? >> >>Other things that came to mind: >> - It's possible sometimes we return an improper content-length. 
>> - If a client connection closes before reading any data back from its
>>connection, does perlbal always junk the backend request before reusing it?
>>
>>I do have a weird routing setup in order to have two perlbal processes
>>running on the same IP address, but since that's just on the frontend
>>and only for incoming requests, I can't see how the responses would get
>>switched... If they were, it'd happen a lot more often than with what we
>>saw.
>>
>>Thanks,
>>-Alan
>>
>>

From brad at danga.com Fri Nov 18 10:35:51 2005
From: brad at danga.com (Brad Fitzpatrick)
Date: Fri Nov 18 10:35:54 2005
Subject: Bizarre perlbal problem
In-Reply-To: <437E16C5.6000108@rydia.net>
References: <437E112D.60803@rydia.net> <437E16C5.6000108@rydia.net>
Message-ID:

I hear your concern, but I'm not worried about Perlbal mixing up requests
and responses.

However, what I'm asking about is different: not whether each request can
go to a different server, but whether PHP is caching any session data on
the CONNECTION object rather than the REQUEST object.

So you've got a TCP connection with a bunch of HTTP requests inside it:

If the PHP is assuming that every HTTP request on a connection is from the
same user (which is usually the case!) and caching on the TCP connection
object, then Perlbal could be confusing it, because Perlbal's persistent
connections to PHP aren't aligned with users at all.

On Fri, 18 Nov 2005, Dormando wrote:

> I don't think so...
>
> Our backend is just direct connections to PHP running behind either
> apache1 or apache2. As long as the remote user's independent requests go
> down and come back from the same server it should work. IE; If user A
> sends a request down, it just needs to get the reply to that request
> back. If user A sends a request down, but gets the response headers from
> user B, we'd get what we're seeing now.
>
> There's no reliance on the same user hitting the same exact backend
> connection for multiple requests.
Before we put the LB in there was no > keepalive at all, and users would hit one of 80 servers at random. > > -Alan > > Brad Fitzpatrick wrote: > > Does your backend's session management assuming that for a given > > connection, the same user will always be on that connection? What backend > > do you use, btw? > > > > If so, that's your problem, since Perlbal mixes up requests and > > connections. (on purpose... it's a huge optimization) > > > > - Brad > > > > > > On Fri, 18 Nov 2005, Dormando wrote: > > > > > >>(sorry, couldn't resist :P) > >> > >>We got HTTP 1.0 keepalives working (for the most part) late in the day > >>yesterday. Overnight we witnessed a pretty bad glitch where our users > >>would randomly get other user's site cookies and become logged in as > >>someone else. > >> > >>It happened in a small percentage of users, but turning off the backend > >>keepalives seems to have removed the issue. We're still investigating on > >>our end, but I'm having a hard time even speculating how that happened; > >>was perlbal sending back responses from clients other than the > >>requestor? Any insight? > >> > >>Other things that came to mind: > >> - It's possible sometimes we return an improper content-length. > >> - If a client connection closes before reading any data back from its > >>connection, does perlbal always junk the backend request before reusing it? > >> > >>I do have a weird routing setup in order to have two perlbal processes > >>running on the same IP address, but since that's just on the frontend > >>and only for incoming requests, I can't see how the responses would get > >>switched... If they were, it'd happen a lot more often than with what we > >>saw. 
> >> > >>Thanks, > >>-Alan > >> > >> > > From dormando at rydia.net Fri Nov 18 11:00:15 2005 From: dormando at rydia.net (Dormando) Date: Fri Nov 18 11:00:45 2005 Subject: Bizarre perlbal problem In-Reply-To: References: <437E112D.60803@rydia.net> <437E16C5.6000108@rydia.net> Message-ID: <437E24BF.8080200@rydia.net> Ok, all I had was speculation. This sounds more grounding. Now, you might know that I'm not a PHP haxx0r. You might say I barely know the first thing about the language; The fact that PHP seems to mix up some stuff per connection instead of per request was news to me. Our backend is also running a dozen different versions of PHP at the moment (which I'm working on fixing!), so it's possible there's a glitch or two involved here. I can reason the glitch happening sometimes and being related to PHP persisting some data either because of a glitch in an old version of PHP or because of a code bug in a backend I've never looked at myself. It doesn't feel like a widespread glitch though, given how few of our users are hit by it. If the first request to a keep-alive connection had all the login data for the rest of the requests, every one of our users would be logged out within a couple minutes. I'll go back to working with one of the programmers in hunting it down, but for the record; are there any other reasons why this might happen? Nothing related to connections blowing up or dying or getting confused? What happens when a bad content-length is fed back to perlbal on accident? (yes, we're checking X-Forwarded-For and such, I made sure of that at least). Thanks, -Alan Brad Fitzpatrick wrote: > I hear your concern, but I'm not worried about Perlbal mixing up requests > and responses. > > However, what I'm asking about is different: not that each request can go > to different servers, but that PHP isn't caching any session info data on > the CONNECTION object, not the REQUEST object. 
> > So you got a TCP connection with a bunch of HTTP requests inside it: > > > > > > > > > > If the PHP is making the assumption that each http request is from the > same user (which is usually the case!) and caching on the TCP connection > object, then Perlbal could be confusing it, because Perlbal's persistent > connections to PHP aren't aligned with users at all. > > > > On Fri, 18 Nov 2005, Dormando wrote: > > >>I don't think so... >> >>Our backend is just direct connections to PHP running behind either >>apache1 or apache2. As long as the remote user's independent requests go >>down and come back from the same server it should work. IE; If user A >>sends a request down, it just needs to get the reply to that request >>back. If user A sends a request down, but gets the response headers from >>user B, we'd get what we're seeing now. >> >>There's no reliance on the same user hitting the same exact backend >>connection for multiple requests. Before we put the LB in there was no >>keepalive at all, and users would hit one of 80 servers at random. >> >>-Alan >> >>Brad Fitzpatrick wrote: >> >>>Does your backend's session management assuming that for a given >>>connection, the same user will always be on that connection? What backend >>>do you use, btw? >>> >>>If so, that's your problem, since Perlbal mixes up requests and >>>connections. (on purpose... it's a huge optimization) >>> >>>- Brad >>> >>> >>>On Fri, 18 Nov 2005, Dormando wrote: >>> >>> >>> >>>>(sorry, couldn't resist :P) >>>> >>>>We got HTTP 1.0 keepalives working (for the most part) late in the day >>>>yesterday. Overnight we witnessed a pretty bad glitch where our users >>>>would randomly get other user's site cookies and become logged in as >>>>someone else. >>>> >>>>It happened in a small percentage of users, but turning off the backend >>>>keepalives seems to have removed the issue. 
We're still investigating on
>>>>our end, but I'm having a hard time even speculating how that happened;
>>>>was perlbal sending back responses from clients other than the
>>>>requestor? Any insight?
>>>>
>>>>Other things that came to mind:
>>>> - It's possible sometimes we return an improper content-length.
>>>> - If a client connection closes before reading any data back from its
>>>>connection, does perlbal always junk the backend request before reusing it?
>>>>
>>>>I do have a weird routing setup in order to have two perlbal processes
>>>>running on the same IP address, but since that's just on the frontend
>>>>and only for incoming requests, I can't see how the responses would get
>>>>switched... If they were, it'd happen a lot more often than with what we
>>>>saw.
>>>>
>>>>Thanks,
>>>>-Alan
>>>>
>>>>
>>
>>

From brad at danga.com Fri Nov 18 11:07:00 2005
From: brad at danga.com (Brad Fitzpatrick)
Date: Fri Nov 18 11:07:03 2005
Subject: Bizarre perlbal problem
In-Reply-To: <437E24BF.8080200@rydia.net>
References: <437E112D.60803@rydia.net> <437E16C5.6000108@rydia.net> <437E24BF.8080200@rydia.net>
Message-ID:

Perlbal closes connections whenever it gets the slightest hint of somebody
being confused. This was less true in the past, but the past 6 months or
so of development have added paranoia/robustness at the cost of a slight
loss of performance in some areas.

On Fri, 18 Nov 2005, Dormando wrote:

> Ok, all I had was speculation. This sounds more grounding.
>
> Now, you might know that I'm not a PHP haxx0r. You might say I barely
> know the first thing about the language; The fact that PHP seems to mix
> up some stuff per connection instead of per request was news to me. Our
> backend is also running a dozen different versions of PHP at the moment
> (which I'm working on fixing!), so it's possible there's a glitch or two
> involved here.
> > I can reason the glitch happening sometimes and being related to PHP > persisting some data either because of a glitch in an old version of PHP > or because of a code bug in a backend I've never looked at myself. It > doesn't feel like a widespread glitch though, given how few of our users > are hit by it. If the first request to a keep-alive connection had all > the login data for the rest of the requests, every one of our users > would be logged out within a couple minutes. > > I'll go back to working with one of the programmers in hunting it down, > but for the record; are there any other reasons why this might happen? > Nothing related to connections blowing up or dying or getting confused? > What happens when a bad content-length is fed back to perlbal on accident? > > (yes, we're checking X-Forwarded-For and such, I made sure of that at > least). > > Thanks, > -Alan > > Brad Fitzpatrick wrote: > > I hear your concern, but I'm not worried about Perlbal mixing up requests > > and responses. > > > > However, what I'm asking about is different: not that each request can go > > to different servers, but that PHP isn't caching any session info data on > > the CONNECTION object, not the REQUEST object. > > > > So you got a TCP connection with a bunch of HTTP requests inside it: > > > > > > > > > > > > > > > > > > > > If the PHP is making the assumption that each http request is from the > > same user (which is usually the case!) and caching on the TCP connection > > object, then Perlbal could be confusing it, because Perlbal's persistent > > connections to PHP aren't aligned with users at all. > > > > > > > > On Fri, 18 Nov 2005, Dormando wrote: > > > > > >>I don't think so... > >> > >>Our backend is just direct connections to PHP running behind either > >>apache1 or apache2. As long as the remote user's independent requests go > >>down and come back from the same server it should work. 
IE; If user A > >>sends a request down, it just needs to get the reply to that request > >>back. If user A sends a request down, but gets the response headers from > >>user B, we'd get what we're seeing now. > >> > >>There's no reliance on the same user hitting the same exact backend > >>connection for multiple requests. Before we put the LB in there was no > >>keepalive at all, and users would hit one of 80 servers at random. > >> > >>-Alan > >> > >>Brad Fitzpatrick wrote: > >> > >>>Does your backend's session management assuming that for a given > >>>connection, the same user will always be on that connection? What backend > >>>do you use, btw? > >>> > >>>If so, that's your problem, since Perlbal mixes up requests and > >>>connections. (on purpose... it's a huge optimization) > >>> > >>>- Brad > >>> > >>> > >>>On Fri, 18 Nov 2005, Dormando wrote: > >>> > >>> > >>> > >>>>(sorry, couldn't resist :P) > >>>> > >>>>We got HTTP 1.0 keepalives working (for the most part) late in the day > >>>>yesterday. Overnight we witnessed a pretty bad glitch where our users > >>>>would randomly get other user's site cookies and become logged in as > >>>>someone else. > >>>> > >>>>It happened in a small percentage of users, but turning off the backend > >>>>keepalives seems to have removed the issue. We're still investigating on > >>>>our end, but I'm having a hard time even speculating how that happened; > >>>>was perlbal sending back responses from clients other than the > >>>>requestor? Any insight? > >>>> > >>>>Other things that came to mind: > >>>> - It's possible sometimes we return an improper content-length. > >>>> - If a client connection closes before reading any data back from its > >>>>connection, does perlbal always junk the backend request before reusing it? 
> >>>> > >>>>I do have a weird routing setup in order to have two perlbal processes > >>>>running on the same IP address, but since that's just on the frontend > >>>>and only for incoming requests, I can't see how the responses would get > >>>>switched... If they were, it'd happen a lot more often than with what we > >>>>saw. > >>>> > >>>>Thanks, > >>>>-Alan > >>>> > >>>> > >> > >> > > From dormando at rydia.net Fri Nov 18 20:05:59 2005 From: dormando at rydia.net (Dormando) Date: Fri Nov 18 20:06:34 2005 Subject: Bizarre perlbal problem In-Reply-To: <437E3C75.4030506@powertrip.co.za> References: <437E112D.60803@rydia.net> <437E16C5.6000108@rydia.net> <437E3C75.4030506@powertrip.co.za> Message-ID: <437EA4A7.1020500@rydia.net> Jacques Marneweck wrote: > Brad Fitzpatrick wrote: > >>I hear your concern, but I'm not worried about Perlbal mixing up requests >>and responses. >> >>However, what I'm asking about is different: not that each request can go >>to different servers, but that PHP isn't caching any session info data on >>the CONNECTION object, not the REQUEST object. >> > > I'll do a little digging with your theory ;) > > With PHP everything is stateless from my experience with PHP over the > past 8 years. Each request is treated in a stateless manner but one can > get session data based on the session cookie / session identifier > specified as ?mysession=sessionid style of URL's or both depending on > the scenario. > > I'm currently serving doc.php.net up from two servers with one perlbal > instance running on one of the boxes without any issues atm. I' haven't > upgraded to 1.39 yet. I'll most likely get round to doing the upgrade > this weekend. > > Regards > --jm We couldn't find anything in our codebase that uses variables which persist between requests on the same connection. We had a hard time figuring out what can persist for the whole connection at all... 
We spent some time opening a Keep-Alive connection to a test Apache server, then sending it requests with different user login cookies (or none at all) each time. We weren't able to convince either of two versions of PHP to send us back the wrong cookies. Further, in any scenario we could think of, our users would be getting their logins swapped instantly, not in as small a percentage of cases as we were seeing. If a single keep-alive connection can service up to 500 requests in our case, there'd be a lot of room for something to get poisoned and a lot of requests to get poisoned. At this point we're leaning toward the idea that one or two of our ancient webservers are running an ancient broken PHP install that returns bunkus data occasionally, and that the Content-Length injection the devs had tried wasn't 100% perfect either. In a couple of weeks our webserver backend will be an array of shiny new Debian servers (new OS, anyway). We'll try it again then and send some updates. Have fun, -Alan
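
For anyone wanting to repeat the keep-alive test described above, here is a minimal sketch in Python, not the script we actually ran. Everything named in it is an assumption for illustration: `EchoCookieHandler` is a hypothetical stand-in for the PHP backend that simply echoes back the Cookie header it received, `/whoami` is a made-up path, and the client speaks HTTP/1.1 persistent connections rather than the HTTP 1.0 keepalives Perlbal uses to its backends. The idea is just to push requests with different login cookies (or none) down one connection and flag any response that does not match its own request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoCookieHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in backend: echoes the Cookie header it received."""

    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        cookie = self.headers.get("Cookie", "(none)")
        # Second line is the client's source port, to show the connection was reused.
        body = f"{cookie}\n{self.client_address[1]}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # An exact Content-Length matters here: a wrong one would desync
        # every later response on the same persistent connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def check_keepalive_isolation(host, port, cookies):
    """Send one request per cookie over a single connection.

    Returns (mismatches, ports): mismatches lists (expected, seen) pairs where
    the response did not match the request; ports lists the client source port
    the server saw for each request.
    """
    conn = http.client.HTTPConnection(host, port, timeout=5)
    mismatches, ports = [], []
    try:
        for cookie in cookies:
            headers = {"Connection": "keep-alive"}
            if cookie is not None:
                headers["Cookie"] = cookie
            conn.request("GET", "/whoami", headers=headers)
            resp = conn.getresponse()
            # Drain the body fully before reusing the connection.
            seen, seen_port = resp.read().decode().split("\n")
            ports.append(int(seen_port))
            expected = cookie if cookie is not None else "(none)"
            if seen != expected:
                mismatches.append((expected, seen))
    finally:
        conn.close()
    return mismatches, ports


def run_demo():
    # Port 0 asks the OS for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoCookieHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        cookies = ["login=alice", None, "login=bob", "login=alice"]
        return check_keepalive_isolation("127.0.0.1", server.server_port, cookies)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    mismatches, ports = run_demo()
    print("mismatches:", mismatches)
    print("distinct client ports:", len(set(ports)))
```

To aim it at a real test server, call `check_keepalive_isolation` with that host and port and a page that reports the logged-in user; the parsing of the response body would need to change to match whatever that page returns.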