URL canonicalization

Zefiro work at zefiro.de
Thu Sep 15 03:16:31 PDT 2005

Hi, Dan...

first, to avoid confusion:
User types 'Zefiro.de' into the login box.
claimed identity: 'Zefiro.de'
canonical identity / normalized claimed identity: 'http://zefiro.de/'
delegate identity: 'http://www.livejournal.com/users/zefirodragon/'
primary key: a unique string field in the database used to distinguish different users. Depending on the DBMS used, an integer
would be the real primary key and the OpenID identity just an indexed unique field. Still, this field is what is used for login.
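
To make the first two terms concrete, here is a minimal sketch in Python of how a consumer might get from the claimed identity to the normalized one. The function name and the exact rules (assume http, lower-case scheme and host, empty path becomes '/') are my assumptions, not something the spec prescribes, and it doesn't follow redirects, which a real consumer would presumably have to do as well.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_claimed_identity(claimed: str) -> str:
    """Hypothetical helper: turn what the user typed into a normalized URL."""
    text = claimed.strip()
    # Assumption: when the user leaves the scheme out ('Zefiro.de'), http is meant.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    # Scheme and host are case-insensitive, so lower-case both.
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    # An empty path means the same resource as '/'.
    path = parts.path or "/"
    # Keep the query, drop the fragment (it never reaches the server).
    return urlunsplit((scheme, host, path, parts.query, ""))


claimed = "Zefiro.de"
normalized = normalize_claimed_identity(claimed)
```

With these rules 'Zefiro.de' comes out as 'http://zefiro.de/', matching the example above.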

We were talking about claimed vs. normalized claimed (may indeed be a better name) identity. The consumer should do nothing with
the delegate identity except use it when asking the OpenID server for authentication. "Nothing" includes not storing it,
not using it as a database key, and not assuming that same delegates mean same identities or different delegates mean different identities.

> It seems cleaner to me to use the canonical identity as primary key.
This depends on how sure you can be that two claimed identities are identical. If you assume normalized claimed identity is what
makes the difference, then you'd store this in the database. All other variants of writing it (different claimed identity
leading to same normalized id) are then accepted equally, for user convenience.
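
A rough sketch of that first variant, in Python with an in-memory SQLite table. The table layout, the column names and the simplified normalize() are all made up for illustration, and the claim is assumed to have been verified with the OpenID server before login() is called. The normalized id is the unique lookup key, and whatever the user typed first is kept only for display.

```python
import sqlite3
from urllib.parse import urlsplit


def normalize(claimed):
    """Simplified, hypothetical normalization: add http, lower-case host, default path '/'."""
    text = claimed.strip()
    if "://" not in text:
        text = "http://" + text
    p = urlsplit(text)
    return f"{p.scheme.lower()}://{p.netloc.lower()}{p.path or '/'}"


db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE users (
           id            INTEGER PRIMARY KEY,   -- the 'real' primary key
           normalized_id TEXT NOT NULL UNIQUE,  -- indexed unique field used for login
           display_id    TEXT NOT NULL          -- what the user typed the first time
       )"""
)


def login(claimed):
    """Return the local user id for an already verified claim, creating it on first login."""
    key = normalize(claimed)
    row = db.execute("SELECT id FROM users WHERE normalized_id = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    cur = db.execute(
        "INSERT INTO users (normalized_id, display_id) VALUES (?, ?)", (key, claimed)
    )
    return cur.lastrowid
```

Here login('http://sally.people.com/') and login('sally.people.com') end up at the same row, which is exactly the convenience in question.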

If you assume the user is capable enough of typing his id the same every time - just like he has to in all other login variants
with normal user names, and autocompletion usually does it anyway - then there is no problem in using the claimed identity.

Since I really want to be known as my claimed identity, not my normalized claimed id - cause it looks much uglier, with the http
etc - I want it to be displayed like I entered it. And then, I personally think it's cleaner to use as primary key exactly what
is displayed - i.e. what users using your site would use as their 'primary key in brain' association with me.

Another point. Redirects. If I understood this correctly, the normalized id is the endpoint of redirects. These may change. I
think the used identity - neither the displayed one nor the one which distinguishes me from others, i.e. your primary key -
shouldn't change when I change redirects. I regard redirects in some way as similar to delegates - it's a middle layer between
claimed id to display/use and what the OpenID server would confirm.

>>Claimed identity is what the user uses for login, and can choose to use, and probably has chosen on purpose.

> That way, you allow the user to enter eg: "http://sally.people.com/" the first time and then just "sally.people.com" the
> second time, and they both point to the same record.
Yes, but are you sure this is what the user wants? It's convenience, right. But the user could just use the same id every time -
could have used the short version in the first place. Which they will do, once they get accustomed to OpenID.

> You could still display ( or even store ) whatever the user entered as a "pretty" identifier.
see above. I think it's cleaner to use as primary key what is actually entered and displayed, not what you internally make of it.

>>c) to allow consumers to handle cases where a canonicalisation of a known identity and the claimed identity would be equal,
>> but the noncanonical version isn't, to propose the known version to the user as a suggestion
> Seems like a redundant step ( from the user's viewpoint ).
Just a convenience which consumers might choose to offer. Some kind of 'smart error correction'. Wouldn't harm, is not needed.
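
A minimal sketch of such a 'did you mean' check, in Python, assuming the consumer keys its users by claimed identity as entered. The function names and the simplified normalization are hypothetical.

```python
from urllib.parse import urlsplit


def normalize(claimed):
    """Simplified, hypothetical normalization: add http, lower-case host, default path '/'."""
    text = claimed.strip()
    if "://" not in text:
        text = "http://" + text
    p = urlsplit(text)
    return f"{p.scheme.lower()}://{p.netloc.lower()}{p.path or '/'}"


def suggest_known_identity(claimed, known_claimed):
    """Return a known claimed identity to offer as a suggestion, or None.

    A suggestion is made only when the entered string is not known as typed
    but normalizes to the same URL as an identity that is known.
    """
    if claimed in known_claimed:
        return None  # exact match, nothing to correct
    target = normalize(claimed)
    for known in known_claimed:
        if normalize(known) == target:
            return known
    return None


known = ["Zefiro.de"]
hint = suggest_known_identity("http://zefiro.de/", known)
```

The consumer would only display the hint ("did you mean 'Zefiro.de'?"); it wouldn't log the user in under the known identity on its own.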

>>d) to encourage consumers who use the OpenID identity for their own profile management (e.g. the first login automatically
>>creates a local user, for local settings etc) to support changing or adding multiple identities to the same profile. (of course
>>only if the users can provide a successful claim for both identities)
> While nice, this seems optional, and outside the scope of the spec.
Surely, it's optional. But I'd suggest every consumer coder should think about whether he wants to use it this way or not.
Therefore it would be nice if it was mentioned somewhere. Could be in the spec, the faq, the use cases. Just somewhere where
newbies can stumble upon it.
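
To show what I mean by d), here is a small Python sketch of a consumer-side profile store where several identities can point to the same local profile. The class and method names are invented for illustration, and the 'verified' flag stands in for a successful claim confirmed by the OpenID server.

```python
class ProfileStore:
    """Hypothetical local profile store: many identities may map to one profile."""

    def __init__(self):
        self._profile_of = {}  # identity string -> local profile id
        self._next_id = 1

    def login(self, identity):
        """First login with an identity automatically creates a local profile."""
        if identity not in self._profile_of:
            self._profile_of[identity] = self._next_id
            self._next_id += 1
        return self._profile_of[identity]

    def add_identity(self, profile_id, new_identity, verified):
        """Attach another identity to an existing profile, only after a successful claim."""
        if not verified:
            raise PermissionError("claim for the new identity was not confirmed")
        owner = self._profile_of.get(new_identity)
        if owner is not None and owner != profile_id:
            raise ValueError("identity already belongs to another profile")
        self._profile_of[new_identity] = profile_id


store = ProfileStore()
me = store.login("Zefiro.de")
store.add_identity(me, "http://www.livejournal.com/users/zefirodragon/", verified=True)
```

After that, logging in with either identity leads to the same local settings.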

