Securing HTML vs securing HTTP
Jens Alfke
jens at mooseyard.com
Mon Jan 23 16:11:44 UTC 2006
On 23 Jan '06, at 12:23 AM, Martin Atkins wrote:
> If you're letting arbitrary people inject arbitrary HTML into your site
> you're already in danger.
Most of my examples didn't involve arbitrary people, but rather the
authors of plug-in software I install on my website: Drupal
extensions, WordPress plugins, Typo themes, etc. And of course
installing software that runs on your web server always implies a
degree of trust; my point was that with an HTML-based identity system
such as OpenID, that trust now extends to your global identity as well.
So the list of bad things a plug-in could do now includes identity
theft, which is an order of magnitude nastier than simply defacing or
erasing my website.
This is admittedly on the far-fetched side, but I am trying to be
paranoid here, which on past evidence isn't a bad idea.
> If you're displaying any user-supplied content on your site you need to
> be running it through an HTML cleaner.
That may not help, frankly. It depends on the quality of the software
that detects the <link> tags, and that software runs on someone else's
machine that I don't control. Extrapolating from other <link>-scraping
software, it may not be very good. For example, take the Pingback spec
<http://www.hixie.ch/specs/pingback/pingback>, which looks for a
<link rel="pingback"...> tag. The spec explicitly tells implementations
that they don't have to use a real HTML parser, but SHOULD instead:
> search the entity body for the first match of the following regular
> expression:
>
> <link rel="pingback" href="([^"]+)" ?/?>
Bad news for CMSs that wrap displayed content in CDATA sections
rather than escaping every metacharacter -- the above regexp will
find a pingback in that CDATA, allowing a user of the site to inject
pingback tags into a page.
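
To make that concrete, here is a minimal Python sketch of the difference
between the regexp approach and a tag-aware parser. Everything in it
other than the regexp itself is an assumption made for illustration: the
sample page, the evil.example URL, and the helper names (LinkFinder,
regex_discover, parser_discover) are hypothetical, and I'm using the
standard library's html.parser as a stand-in for "a real HTML parser".
It is not taken from any actual Pingback or OpenID implementation.

```python
import re
from html.parser import HTMLParser

# The regular expression quoted from the Pingback spec above.
PINGBACK_RE = re.compile(r'<link rel="pingback" href="([^"]+)" ?/?>')

# Hypothetical page: it declares no pingback server of its own, but a
# commenter's text, wrapped in a CDATA section by the CMS, contains one.
PAGE = '''<html><head><title>Blog</title></head><body>
<div class="comment"><![CDATA[<link rel="pingback" href="http://evil.example/xmlrpc" />]]></div>
</body></html>'''


class LinkFinder(HTMLParser):
    """Collects href values of <link rel="pingback"> seen as real tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "pingback":
            href = attributes.get("href")
            if href:
                self.hrefs.append(href)


def regex_discover(body):
    """Spec-style discovery: first regexp match anywhere in the body."""
    match = PINGBACK_RE.search(body)
    return match.group(1) if match else None


def parser_discover(body):
    """Tag-aware discovery: only <link> elements the parser reports."""
    finder = LinkFinder()
    finder.feed(body)
    finder.close()
    return finder.hrefs[0] if finder.hrefs else None


if __name__ == "__main__":
    print("regexp:", regex_discover(PAGE))
    print("parser:", parser_discover(PAGE))
```

The regexp sees only a run of characters, so it reports the commenter's
URL; the parser never sees a <link> start tag there, so it reports
nothing. A parser is not a complete defence (it still has to agree with
browsers about what counts as a tag), but it closes this particular hole.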
Of course in the case of Pingback the damage that could be done is
minimal. Not so for OpenID. I haven't looked into the source code of
the various OpenID client implementations; are they smart enough to
recognize only real <link> tags, not CDATA content?
--Jens