Vpaths performance?

Jonty jonty.wareing at gmail.com
Sat Feb 23 15:56:09 UTC 2008


We started using a similar plugin at Last.fm almost a year ago. We
have had no problems since then, and the Perlbal cluster handles an
enormous amount of traffic.

I wouldn't worry about this too much :-)

Jonty Wareing
--
Developer
Last.fm

On Thu, Feb 21, 2008 at 2:00 PM, Jeremy James <jbj at forbidden.co.uk> wrote:
>
> dormando wrote:
>  > Kevin Olson wrote:
>  >> Has anyone tested Vpaths on a high volume site?
>  >>
>  >> I don't see any glaring problems in the code, but the warning at the top,
>  >> "this has also not been optimized for huge volume sites" seems somewhat
>  >> ominous.
>  >>
>  >> Kevin
>  >>
>  > I think we use something similar and it performs okay.
>  >
>  > For gaiaonline I had a plugin with the paths hardcoded and did direct
>  > string matching for the most possible speed. In the grand scheme of
>  > things that was probably stupid, as my entire perlbal cluster never got
>  > above 20% CPU usage during extreme peaks.
>
>  Likewise, we've been using our own matcher (Urlmatch) on moderately
>  loaded production systems since late 2006 without issue. It's slightly
>  less efficient than Vpaths when it comes to matching URLs.
>
>  Given the chained selectors went in with r739, is it worth removing the
>  "does not play well with vhosts" comment at the top?
>
>  I'm not sure what one would do for a 'huge volume site' to increase
>  performance (i.e. reduce the number of regexps needed to match). Perhaps
>  you could assume that several of your regexps start with similar prefixes,
>  then parse them into a tree to short-circuit having to match them all? E.g.:
>
>  ##
>  # Take a particular month from dynamic (it has some auto xmas content)
>  VPATH /blog/2007/12/ = dynamic_server
>
>  # Take other monthly archives from pre-generated server(s)
>  VPATH /blog/[0-9]{4}/(0[1-9]|1[0-2])/ = static_server
>
>  # But take latest from cgi
>  VPATH /blog/latest = dynamic_server
>
>  VPATH /blog/ = static_server
>
>  # Other stuff
>  VPATH /wiki/.* = wiki_server
>  VPATH /about/ = static_server
>  VPATH /images/profile/.* = dynamic_server
>  VPATH /images/.* = static_server
>
>  SET default_service = dynamic_server
>
>  ##
>
>  You could imagine pulling this out into a tree (I'm sure you get the
>  idea):
>
>  /blog/(.*) =>
>   2007/12/ = dynamic_server
>   [0-9]{4}/(0[1-9]|1[0-2])/ = static_server
>   latest = dynamic_server
>    = static_server
>  /wiki/.* = wiki_server
>  /about/ = static_server
>  /images/(.*) =>
>   profile/.* = dynamic_server
>   .* = static_server
>
>  Now matching something at /images/funny_cat_picture.jpg takes 6 regexp
>  matches to get a hit (4 at the top level, then 2 under /images/), not 8
>  (and it avoids some of the more complicated ones).
>
>  Would automating this really gain much, or are you likely to get better
>  performance and fewer headaches by either writing your own custom plugin
>  or chaining multiple vpaths together (i.e. use
>  VPATH /blog/.* = blog_selector)?
>
>
>  However, if you have particularly complicated and processor-intensive
>  regexps to match on a heavily loaded site, you're doing it wrong.
>  It's probably better to send requests off to the dynamic server by
>  default, then have it return an X-REPROXY header instead if the request
>  could have gone somewhere else.
>
>  Best wishes,
>  Jeremy
>

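As a rough illustration of the prefix-tree idea quoted above, here is a minimal Python sketch (not Perlbal or Vpaths code) that counts how many regexps are attempted for a path with the flat rule list versus the tree. The names FLAT_RULES, TREE_RULES, match_flat and match_tree are made up for this example, and anchoring every pattern at the start of the path is an assumption rather than a statement about how Vpaths matches. Every attempted regexp is counted, including the one that finally hits.

```python
import re

# Flat rule list, in the order of the quoted config.
# Assumption: patterns are anchored at the start of the path (re.match).
# The month pattern is written as 1[0-2] here so that October also matches.
FLAT_RULES = [
    (r"/blog/2007/12/", "dynamic_server"),
    (r"/blog/[0-9]{4}/(0[1-9]|1[0-2])/", "static_server"),
    (r"/blog/latest", "dynamic_server"),
    (r"/blog/", "static_server"),
    (r"/wiki/.*", "wiki_server"),
    (r"/about/", "static_server"),
    (r"/images/profile/.*", "dynamic_server"),
    (r"/images/.*", "static_server"),
]

# The same rules grouped by shared prefix. A list as the target means
# "match the captured remainder against these child rules".
TREE_RULES = [
    (r"/blog/(.*)", [
        (r"2007/12/", "dynamic_server"),
        (r"[0-9]{4}/(0[1-9]|1[0-2])/", "static_server"),
        (r"latest", "dynamic_server"),
        (r"", "static_server"),
    ]),
    (r"/wiki/.*", "wiki_server"),
    (r"/about/", "static_server"),
    (r"/images/(.*)", [
        (r"profile/.*", "dynamic_server"),
        (r".*", "static_server"),
    ]),
]

DEFAULT_SERVICE = "dynamic_server"


def match_flat(path):
    """Return (service, regexps attempted) using the flat list."""
    attempts = 0
    for pattern, service in FLAT_RULES:
        attempts += 1
        if re.match(pattern, path):
            return service, attempts
    return DEFAULT_SERVICE, attempts


def match_tree(path, rules=TREE_RULES):
    """Return (service, regexps attempted) using the prefix tree."""
    attempts = 0
    for pattern, target in rules:
        attempts += 1
        m = re.match(pattern, path)
        if not m:
            continue
        if isinstance(target, str):
            return target, attempts
        # Prefix matched: descend and match only the captured remainder.
        service, sub_attempts = match_tree(m.group(1), target)
        return service, attempts + sub_attempts
    return DEFAULT_SERVICE, attempts


if __name__ == "__main__":
    for p in ("/images/funny_cat_picture.jpg", "/blog/2007/12/", "/wiki/Home"):
        print(p, match_flat(p), match_tree(p))
```

Counted this way, /images/funny_cat_picture.jpg takes 8 attempts with the flat list and 6 with the tree. Note that a path which hits an early flat rule, such as /blog/2007/12/, costs one attempt more in the tree (2 instead of 1), so the gain depends on where the traffic actually goes.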

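The last suggestion in the quoted message (send everything to the dynamic server and let it hand requests back) could look roughly like the sketch below, written as a small Python WSGI app. This is only an illustration under assumptions: it assumes the X-REPROXY-URL form of the header, that reproxying is enabled on the Perlbal service, and a hypothetical static backend at static.example.internal. STATIC_BACKENDS, reproxy_target and app are names invented for the example.

```python
# Hypothetical mapping from URL prefix to the backend that could have
# served the request directly. Host names are placeholders.
STATIC_BACKENDS = {
    "/images/": "http://static.example.internal",
}


def reproxy_target(path):
    """Return a URL to reproxy to, or None if the request is dynamic."""
    for prefix, base in STATIC_BACKENDS.items():
        if path.startswith(prefix):
            return base + path
    return None


def app(environ, start_response):
    """WSGI app: answer dynamically, or ask the proxy to fetch elsewhere."""
    path = environ.get("PATH_INFO", "/")
    target = reproxy_target(path)
    if target is not None:
        # Assumption: the front-end proxy honours X-REPROXY-URL and
        # fetches the response body from that URL on our behalf.
        start_response("200 OK", [
            ("X-REPROXY-URL", target),
            ("Content-Length", "0"),
        ])
        return [b""]
    body = b"dynamic content\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The routing decision here is a plain string prefix check in the application, so the proxy itself needs no regexp matching at all for these paths.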