Sxip concerns with YADIS

Sun Dec 18 19:51:48 UTC 2005

On 18-Dec-05, at 11:00 AM, Josh Hoyt wrote:

> Dick,
>
> I am glad that you have chosen to participate in this conversation.
> The more we inter-operate, the better it is for all of our users.
>
> On 12/17/05, Dick Hardt <dick at sxip.com> wrote:
>> 1) Performance
>>         - double the number of GETs for all HTML persona-urls
>
> As Johannes mentioned, there is support in the specification for
> carrying out the exchange with one GET through content negotiation. I
> expect that the most common platforms (blog software) will support
> this. It is also not difficult to support content negotiation on
> common Web servers with a little bit of configuration[1].

I see that additional pixie dust will resolve the two GETs to one.  
Additional work and config to make it happen though.
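
To make the one-GET case concrete, here is a minimal sketch of what the Relying Party side of content negotiation could look like. The media type string and the function names are assumptions for illustration only, not taken from the specification.

```python
from urllib.request import Request

# Assumption: the media type under which a capabilities document is served.
CAPABILITIES_TYPE = "application/xrds+xml"


def build_discovery_request(persona_url):
    """Build (but do not send) a GET that prefers the capabilities document."""
    accept = CAPABILITIES_TYPE + ", text/html;q=0.5"
    return Request(persona_url, headers={"Accept": accept})


def is_capabilities_response(content_type):
    """True if the server answered with the capabilities document directly."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    return content_type.split(";")[0].strip().lower() == CAPABILITIES_TYPE


req = build_discovery_request("http://josh.myopenid.com/")
```

If the server ignores the Accept header and returns HTML, the Relying Party is back to parsing HTML and possibly issuing a second GET.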

>
>>         - XML parsers take time to load and parse a file
>
> Parsing HTML also takes time, and there must be support for markup
> that is not well-formed, since a significant fraction of deployed HTML
> is ill-formed. Since the capabilities document is small and will
> likely be machine-generated, it is less likely to be ill-formed. Also,
> since this is a new format, there is no legacy of ill-formed content
> to contend with.

Most blogging tools produce well-formed content, but there already is
lots of working code that deals with ill-formed HTML. Nothing new is
needed. Parsing HTML is still MUCH faster than parsing XML. I have had
lots of perf analysis done on this.

Also, per the spec, *ALL* Relying Parties will HAVE to be able to  
parse the HTML, since that is an option.

If a Relying Party can get its data from just HTML, then it does not  
need the XML code path.
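
A rough sketch of that HTML-only code path, using Python's standard `html.parser`. It assumes the OpenID-style `openid.server` and `openid.delegate` rel values; the class and function names and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect rel -> href pairs from <link> tags, tolerating sloppy HTML."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attrs = dict(attrs)
        rel, href = attrs.get("rel"), attrs.get("href")
        if rel and href:
            # A rel attribute may hold several space-separated values.
            for name in rel.lower().split():
                self.links.setdefault(name, href)


def discover_from_html(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical page: mixed-case tags and an unclosed <p>.
SAMPLE = """<html><head><title>Blog</title>
<LINK REL="openid.server" HREF="http://www.myopenid.com/server">
<link rel="openid.delegate" href="http://josh.myopenid.com/">
</head><body><p>unclosed paragraph</body></html>"""

links = discover_from_html(SAMPLE)
```

The parser never needs the document to be well formed; it only reacts to the `<link>` start tags it sees.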

>
> Another concern with embedding the capabilities data in HTML is that
> the content at people's identity URLs may be large. In my case, the
> content at my blog URL is 27KB while my YADIS capabilities document,
> including six service definitions, is only 1.4KB. I am sure that
> parsing a well-formed 1.4KB document will be much quicker than parsing
> a 27KB document. Although it is less significant, there is also much
> less data to transfer.

Most of the time in parsing and data transfer is spent in setup, so
the savings are not as high as one might expect.
A second GET will be more expensive than the additional data, and its
response will also have to be parsed, but with a different parser.

>
> My arguments above do mostly depend on content-negotiation being
> common, so if you do not believe that that will be the case, then your
> objections are valid.

It will not be common at first, since it is not deployed.
The instructions on configuring Apache are nice, but out of scope for  
most bloggers.

>
>> 2) Security
>>         - the user needs control over both the persona-url AND the
>> capabilities-url to secure their identity. Double the URLs, double
>> the risk.
>
> This is certainly true, but I think that in most cases, the
> capabilities URL will exist in the same security domain as the
> identity URL (that is, on the same server). Compromising the
> capabilities document should be just as hard as compromising the
> content of the identity URL. If they are hosted separately, then your
> objections are valid.

The good approach would be to put them in the same place. Even so, it
is just another thing that can break.

>
>> 3) Implementation
>>         - all major web development platforms have high performance
>> HTML parsers that present the document as a DOM. XML parsing is common,
>> but is more complex than manipulating a DOM, and another thing for
>> the developer to figure out.
>
> An XML parser can give the developer a DOM to work with as well, so I
> do not think that is a distinction between XML and HTML. It is also
> true that the structure of the capabilities document in the current
> version of the YADIS specification is simpler than HTML. Every part of
> that document is relevant to the task of describing capabilities,
> unlike HTML.

I was not implying that you could not get a DOM from XML; sorry if
that is what you understood.
My point is that people already understand how to look at the HTML
DOM. There already is lots of microcontent in <LINK> for RSS etc.
Existing skills and tech.

>
> I hope that there will be common libraries that implement the protocol
> so that the developer can be presented with a simple data structure
> representing the capabilities that he desires instead of having to
> write the code to parse the document himself. I admit that I'm biased
> here, because I am the author of one such library, but I think that
> quality libraries will be common.
>
>>         - getting two files requires more code, and more chances of
>> something being broken
>
> Although additional complexity is a risk, common libraries will
> greatly reduce this risk. It is still a very simple protocol.
>
>
>> SUGGESTION:
>>
>> We liked the way that OpenID worked earlier with a LINK tag in HTML:
>> [...]
>> Given that most protocols will have their own ways of describing what
>> it can do, we don't see value in a common capability file.
>
> When I read the original YADIS specification, I was very much on the
> same page. I thought YADIS was introducing unnecessary complexity into
> the simple, elegant scheme that OpenID uses for discovery. It took me
> weeks to come around and see value in introducing the further
> complexity.
>
> There are a few features that the new capabilities discovery scheme
> has that make it superior to embedding the information in HTML:
>
> 1. It is extensible. The document format that it uses allows for
> adding arbitrary information in the scope of a single service
> definition. For example:
>
>   <Service>
>     <Type>http://openid.net/signon/1.0</Type>
>     <URI>http://www.myopenid.com/server</URI>
>     <openid:Delegate>http://josh.myopenid.com/</openid:Delegate>
>   </Service>
>
>   When using HTML to express this same information, it is split
> between two <link> tags that are not obviously connected. The Type tag
> also makes extremely explicit exactly what type of service is being
> described, and makes it so that a developer can easily look up that
> protocol. The URI-based type identifier also makes it so that a new
> service can be defined without fear of namespace collisions.

I am all about extensibility. Often, though, things are made
extensible but the extensibility is never used.
How many "upgradable" PCs are upgraded?

My suggestion is that if the thing is simple, allow it to be in the  
HTML, don't force a parse of the XML.
If something wants the XML, the server can supply that, but don't  
force all Relying Parties to have to use the XML.
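
For comparison, a sketch of the XML code path for the service definition quoted above, using `xml.etree.ElementTree`. The `Services` wrapper element and the namespace URI bound to the `openid` prefix are assumptions added so the fragment parses on its own; the real capabilities document may differ.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the "openid" prefix.
OPENID_NS = "http://openid.net/xmlns/1.0"

# Hypothetical wrapper around the quoted <Service> element.
SAMPLE = f"""<Services xmlns:openid="{OPENID_NS}">
  <Service>
    <Type>http://openid.net/signon/1.0</Type>
    <URI>http://www.myopenid.com/server</URI>
    <openid:Delegate>http://josh.myopenid.com/</openid:Delegate>
  </Service>
</Services>"""


def parse_services(xml_text):
    """Return one dict per <Service> element found in the document."""
    root = ET.fromstring(xml_text)
    services = []
    for svc in root.iter("Service"):
        services.append({
            "type": svc.findtext("Type"),
            "uri": svc.findtext("URI"),
            # Extension elements live in their own namespace.
            "delegate": svc.findtext(f"{{{OPENID_NS}}}Delegate"),
        })
    return services


services = parse_services(SAMPLE)
```

A Relying Party that only needs the server and delegate URLs gets the same two values either way; the XML path only pays off when the extra structure is actually used.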

>
> 2. The format allows for fallback in the case of a service failure. If
> I have accounts on two different SXIP homesites, I can include both of
> them and express a preference for my primary homesite. If there is a
> service outage for my primary homesite, I can still use my identity,
> because services can fall back onto my secondary homesite.

In our model, the user enters the Homesite at the Relying Party
(this is a whole other discussion).
If the Homesite is down, the user can enter another one.

At the persona-url, multiple Homesites could be listed.
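
A minimal sketch of how a Relying Party might walk such a list, assuming the Homesites are listed in order of preference. The function name, the example URLs, and the `is_up` check are all hypothetical; a real check would attempt a connection.

```python
def pick_homesite(homesites, is_up):
    """Return the first listed Homesite that responds, or None."""
    for url in homesites:
        if is_up(url):
            return url
    return None


# Hypothetical Homesites, primary first.
homesites = ["https://primary.example/", "https://secondary.example/"]

# Simulate an outage at the primary Homesite.
down = {"https://primary.example/"}
chosen = pick_homesite(homesites, lambda url: url not in down)
```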

>
> 3. The format allows for expressing preferences about which identity
> services should be used. I could include SXIP, OpenID, and LID and say
> that I prefer them to be used in that order. A relying party could
> then respect my preferences and make my user experience more
> consistent.

Good point. A Relying Party is more likely to choose its own
preferences, though. :)

>
>
> In general, your criticisms are well-founded. As Johannes said, I am
> glad that you are bringing your technical and practical expertise to
> this conversation.

<blush>

> Although we may not be in complete agreement
> technically, it sounds like your goals in participating are similar to
> mine. I wrote up a description[2] of what YADIS is that does not
> include implementation details. I'd be interested to know whether this
> description matches what you want to achieve from participating in

General agreement.

Summary:

The current model requires *both* HTML and XML parsing. It is simpler
if only one is needed, so make that HTML.

Allow simple declarations in HTML for simple Relying Parties. This
lowers the barrier to entry.

XML can be used for extensions for Relying Parties that want to take  
advantage of it.