Equivalence of OpenID Server URLs
Nathan D. Bowen
nbowen+yadis at andtonic.com
Thu Jun 9 17:54:19 PDT 2005
I apologize if this is covered in a thread I missed, but while switching
my code to use POSTs, I just realized I was making an assumption that
should maybe be discussed explicitly.
When an OpenID server URL looks like:
http://www.livejournal.com/misc/openid.bml?ljuser_sha1=23867a7f35bd6c57...
it is very important for the consumer to preserve that query string for
identity checks, but it's problematic for the association phase.
Obviously, in the case of LiveJournal, if I have an existing server
association for
http://www.livejournal.com/misc/openid.bml
I should use it instead of re-associating for each newly-encountered
value of ljuser_sha1. So in the case of LiveJournal, I will key my
collection of association handles on [Server URL Minus Query String].
I think that's completely reasonable, I just want to make the assumption
explicit:
For any newly-encountered Server URL, a Consumer should reuse an
existing valid association handle if the association's Server URL
matches the newly-encountered URL with query strings removed from both.
Or maybe "matches the scheme, host, port, and path of the
newly-encountered URL", if there's any benefit to reminding implementors
that scheme and port are important.
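As a rough illustration of the rule above, here is a minimal Python
sketch of keying associations on scheme, host, port, and path. The
names association_key, remember_association, and find_association, the
in-memory dict, and the default-port table are all assumptions made for
this example, not part of any spec. Expiry checks on the stored handle
are left out.

```python
from urllib.parse import urlsplit

# Assumed default ports, so that an explicit ":80" and an omitted port
# produce the same key.
DEFAULT_PORTS = {"http": 80, "https": 443}

def association_key(server_url):
    """Return (scheme, host, port, path), ignoring query and fragment."""
    parts = urlsplit(server_url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    path = parts.path or "/"
    return (scheme, host, port, path)

# Hypothetical in-memory store: association key -> association handle.
associations = {}

def remember_association(server_url, handle):
    associations[association_key(server_url)] = handle

def find_association(server_url):
    # Returns None when no association is stored for this server.
    return associations.get(association_key(server_url))
```

With this, an association stored for
http://www.livejournal.com/misc/openid.bml would be found again for the
same URL with any ljuser_sha1 value appended, but not for an https or
different-port variant of it.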
I can only think of one situation where this rule would break things,
and that's if I come up with an unlikely example of a server that needs
a query parameter to perform an association.
Maybe a server handles three distinct groups of identities and wants to
maintain separate associations for the groups. (I don't know why; maybe
it's a server for three partner sites and they use ?partnersite=A,B,C.)
But things like ljuser_sha1 preclude us from distinguishing between
servers based on _every_ query parameter, and we're not going to make a
list of which query parameters are and are not
"association-distinguishers". So I assume we can just explicitly refuse
to distinguish based on any of them.
If someone finds themselves in the situation of this hypothetical
example server, they'll just have to distinguish their three groups by
scheme, host, port, or path instead of by query parameters.
An added benefit to explicitly forbidding any query strings in the
association phase is the very reason I noticed this -- we're POSTing in
that phase. There's a little gray area involved when you use GET-style
query strings in a POST request. Dropping them off completely will
sidestep that issue nicely without making consumers copy the GET
parameters into their POSTs or trust the server to combine the two types
of parameters, or whatever.
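
To make the POST side concrete, here is a small Python sketch of what
dropping the query string for the association request could look like.
The function names and the params argument are hypothetical, and the
actual association parameters are deliberately not spelled out here.
The request is only built, not sent.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit
from urllib.request import Request

def association_endpoint(server_url):
    """Return the Server URL with its query string and fragment removed."""
    parts = urlsplit(server_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

def build_association_request(server_url, params):
    """Build (but do not send) a POST to the query-free Server URL.

    params is a dict of whatever fields the association phase needs;
    they travel only in the POST body, never in the URL.
    """
    body = urlencode(params).encode("ascii")
    return Request(association_endpoint(server_url), data=body, method="POST")
```

The full Server URL, query string included, would still be kept around
separately for the identity checks.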