Proposal (Was: When are and aren't two URLs the same?)
Thomas Broyer
t.broyer at gmail.com
Mon Apr 24 19:15:26 UTC 2006
2006/4/24, Carl Howells <chowells at janrain.com>:
> > How about the case when there are . or .. components in the absolute path part of a URL?
> > Including when they come last in the path, with no trailing slash.
> >
> > E.g. Should http://example.org/users/joe and http://example.org/users/foo/../joe be the same?
> > What about http://alice.example.com/ vs http://alice.example.com/.
> >
>
> Nothing should be done in that case. Those pairs of URLs aren't the
> same, even if some servers will provide the same content for them. Just
> because the path section of a URL looks like a Unix path doesn't mean
> that it is or should be treated as a Unix path.

Look at the example in section 6.2.2 of RFC 3986.
Actually, it would be a bug to assume those are different URIs, based
on section 6.2.2.3 of that very same RFC, reproduced here for
completeness:

6.2.2.3.  Path Segment Normalization

The complete path segments "." and ".." are intended only for use
within relative references (Section 4.1) and are removed as part of
the reference resolution process (Section 5.2). However, some
deployed implementations incorrectly assume that reference resolution
is not necessary when the reference is already a URI and thus fail to
remove dot-segments when they occur in non-relative paths. URI
normalizers should remove dot-segments by applying the
remove_dot_segments algorithm to the path, as described in
Section 5.2.4.
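
To make that concrete, here is a minimal Python sketch of the remove_dot_segments steps as I read them in Section 5.2.4. The function names and the use of urllib.parse to split the URL are assumptions of this sketch, not something the RFC or Yadis prescribes, and it covers dot-segment removal only (no case or percent-encoding normalization).

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path):
    """Sketch of RFC 3986 section 5.2.4; step letters follow the RFC."""
    output = []  # segments already moved to the output buffer
    inp = path   # the remaining input buffer
    while inp:
        if inp.startswith("../"):            # A: drop a leading "../"
            inp = inp[3:]
        elif inp.startswith("./"):           # A: drop a leading "./"
            inp = inp[2:]
        elif inp.startswith("/./"):          # B: "/./" becomes "/"
            inp = "/" + inp[3:]
        elif inp == "/.":                    # B: trailing "/." becomes "/"
            inp = "/"
        elif inp.startswith("/../"):         # C: "/../" removes the last segment
            inp = "/" + inp[4:]
            if output:
                output.pop()
        elif inp == "/..":                   # C: trailing "/.." does the same
            inp = "/"
            if output:
                output.pop()
        elif inp in (".", ".."):             # D: a lone dot-segment is dropped
            inp = ""
        else:                                # E: move one segment to the output
            start = 1 if inp.startswith("/") else 0
            end = inp.find("/", start)
            if end == -1:
                end = len(inp)
            output.append(inp[:end])
            inp = inp[end:]
    return "".join(output)


def normalize_path(url):
    """Hypothetical helper: apply remove_dot_segments to the path of a URL."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=remove_dot_segments(parts.path)))


print(normalize_path("http://example.org/users/foo/../joe"))
print(normalize_path("http://alice.example.com/."))
```

With this sketch, both pairs from the question come out identical after normalization: the first prints http://example.org/users/joe and the second prints http://alice.example.com/.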
--
Thomas Broyer