URL canonicalization
Dan Libby
danda at videntity.org
Wed Sep 14 15:32:35 PDT 2005
Michael 'hacker' Krelin wrote:
>On Wed, Sep 14, 2005 at 03:53:28PM -0600, Dan Libby wrote:
>
>
>I'm afraid you're overdoing it. You should not lowercase anything past
>hostname.
>
>
You're right. I mistakenly thought the OpenID library was lowercasing.
At present, it's not even lowercasing the hostname.
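
For illustration, a minimal sketch of host-only lowercasing, assuming Python's standard urllib.parse module. The function name lowercase_host is hypothetical and is not part of any OpenID library. It leaves the path, query string and any userinfo untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_host(url: str) -> str:
    """Lowercase only the host (and port) portion of a URL.

    Hypothetical helper for illustration; not taken from any OpenID library.
    """
    parts = urlsplit(url)
    # netloc may look like "user:pass@Host.Example:8080".
    # Split off userinfo so that only the host:port part is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    # Path, query and fragment are passed through unchanged.
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

With this, "http://Sally.People.COM/Users/Sally" and "http://sally.people.com/Users/Sally" compare equal after canonicalization, while a difference in path case is still treated as significant.
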
>>"https://sally.people.com/" be treated as a separate identity from
>>"http://sally.people.com/"? Or should the protocol be ignored?
>>
>>
>
>Clearly https://sally.people.com/ can have content different from
>http://sally.people.com/ so you can't just ignore it.
>
>
Fair enough. Makes things simpler anyway.
Still, I wonder about things like %20 and + in the URL. They can mean
the same character when unescaped.
Or a URL like "http://people.com/users//sally", where the "//" should be
converted to a single '/' by the webserver.
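
To make both cases concrete, here is a rough sketch using Python's standard urllib.parse and re modules. The helper names are hypothetical. It rests on two assumptions: "+" decodes to a space only under form-style (query string) decoding, and whether a given webserver really collapses "//" is up to that server and is not guaranteed.

```python
import re
from urllib.parse import unquote, unquote_plus


def same_when_decoded(a: str, b: str, form_encoded: bool = True) -> bool:
    """Report whether two escaped strings decode to the same text.

    With form_encoded=True, "+" is decoded as a space (query string style).
    With form_encoded=False, "+" stays a literal plus sign (path style).
    """
    decode = unquote_plus if form_encoded else unquote
    return decode(a) == decode(b)


def collapse_slashes(path: str) -> str:
    """Replace each run of two or more slashes in a path with a single one.

    Assumption: this mimics what some webservers do; it is not required
    behaviour, so apply it to the path component only, never the full URL.
    """
    return re.sub(r"/{2,}", "/", path)
```

Under these assumptions "a%20b" and "a+b" match only when decoded as query string data, and "/users//sally" collapses to "/users/sally".
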
So that seems like 2 points for the spec to address:
1) lower-casing of domain names
2) normalization of query string
Any others?
Dan Libby