proposal for capabilities lookup
Michael Graves
groupmg at gmail.com
Sat Nov 19 10:13:14 PST 2005
Ernst Johannes <jernst+lists.danga.com <at> netmesh.us> writes:
>
> I'm not entirely following what you are proposing. Comments in-line.
>
> > I think two goals are most important for adoption:
> > * Those using their own site as an identity URL must be able to add
> > YADIS support without writing any code or greatly disturbing their
> > existing site. (The HTML indirection case.) This implies a regular
> > GET,
> > as you mentioned.
>
> So we agree on this one.
>
> > * In *that same request* consumers must somehow indicate that it's
> > really capabilities that they are interested in, so that larger
> > identity
> > hosts like LiveJournal can remove the layer of indirection and just
> > return the capabilities directly, without having to generate a rather
> > costly journal page that's just going to be discarded. This is the
> > purpose I was bending "Accept" for.
>
> That's why I was talking about the "local convention". Or do you mean
> a global convention?
>
> The idea of returning a new URL from which the capabilities can be
> retrieved is modeled directly after XML: XML files also typically
> specify their schema through a separate URL, rather than in-lining it
> (which they could, but which is undesirable for a variety of reasons).
>
> This compromise is usually acceptable because DTDs change very
> slowly, while XML files change relatively quickly: which is why XML
> processors all have this funny local cache of retrieved DTDs.
>
> > It seems to me that the path of least disruption is just to use a
> > custom
> > request header in place of Accept in my proposal. However, this has
> > the
> > drawback that it's nonstandard and thus proxy servers are less
> > likely to
> > be able to cope with its presence causing a different response,
> > even if
> > it appears in the "Vary" header field.
>
> Note that my proposal does not require such a custom request header,
> it only uses a non-standard response header.
>
> > I seem to remember that a very similar conclusion was reached when we
> > were discussing this for OpenID, which is why OpenID only supports the
> > HTML indirection case despite its inherent inefficiency.
>
> The OpenID case is a little different because the URL points to the
> OpenID identity server, rather than a capabilities lookup (which does
> not exist in OpenID). On another level, of course there is a parallel
A couple comments, Johannes, and not necessarily aimed at you, but to the
board here in general.
1. As someone working on plans to support YADIS/OpenID/LID in some of our core
infrastructure - ping servers, trackback engines, reputation systems,
tagging/metadata frameworks, etc. - the concern here for us and service
providers like us isn't *bandwidth* per se. CPU cycles are important to
conserve, but the downside of multiple fetches is really tied to *time*
delays, rather than the bandwidth an additional step would consume. Not saying
bandwidth shouldn't be a concern -- hold on, yes I am. Bandwidth is something
we should happily trade off for better adoption and flexibility.
Identity systems do not fail because they are bandwidth hogs.
2. One of the profound advantages of OpenID is what I've come to call
the "ISO" -- "in spite of" -- factor. OpenID has a high "in spite of"
advantage; users can bootstrap their own URLs with little or no help from the
service provider of the actual URL. As long as I have control over the HTML
that is being served, I don't need anything from my service provider to play,
just a way to edit the <head> section of my HTML page.
This is such a small, humble-looking feature, that it's *very* easy to
overlook or underweight its importance.
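To make the "edit the <head>" case concrete, here is a minimal Python sketch of what a consumer might do with such a page. It assumes the OpenID 1.x convention of a <link rel="openid.server"> tag; the page content and the server URL in it are made up for illustration, and the sketch scans the whole document rather than strictly the <head>.

```python
from html.parser import HTMLParser


class HeadLinkFinder(HTMLParser):
    """Collect href values of <link> tags, keyed by each rel token."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel, href = attr.get("rel"), attr.get("href")
        if rel and href:
            # rel may hold several space-separated tokens; keep the first href seen
            for token in rel.lower().split():
                self.links.setdefault(token, href)


def find_identity_server(html_text):
    """Return the openid.server href from the page, or None if absent."""
    finder = HeadLinkFinder()
    finder.feed(html_text)
    return finder.links.get("openid.server")


# Hypothetical page: the only thing the user had to edit is the <link> line.
PAGE = """<html><head>
<title>Michael's page</title>
<link rel="openid.server" href="http://idserver.net/server">
</head><body>Hello</body></html>"""

print(find_identity_server(PAGE))
```

Nothing here needs cooperation from the host: the user adds one line of markup, and the consumer does the rest.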
3. Johannes and I have discussed this, but I think it's worth pointing out: a
CGI-style request cannot be served from a static file (with standard
configurations). If my request looks like this:
http://idserver.net/mgraves?meta=capabilities
I'm stuck if I don't have control over my Apache environment. Even then,
there's a mod that must be bolted on to Apache to support the semantics of the
CGI. Now if my request is:
http://idserver.net/mgraves/yadis.xml
I've now got the ability as a user to "bootstrap" myself, by starting up
notepad, and configuring the file as needed, and saving it to
http://idserver.net/mgraves/
The interesting thing here is that orienting things toward static file
semantics (GET "yadis.xml" vs. CGI call) does *not* hinder the service
provider from serving static file requests with dynamic responses. In other
words, this URL:
http://idserver.net/mgraves/yadis.xml
is *ostensibly* a static file on the server, but with straightforward changes
to the Apache configuration, this same URL can be served from a database or
other resource, rather than just feeding back a static file. "yadis.xml" may
not exist as a discrete file on any user's home directory for a given service
provider. It may just be synthesized on demand and spit back as the
appropriate file by Apache.
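As a rough illustration of that asymmetry, the following Python sketch answers the same URL either from a user-edited static file or from a provider-side record. It stands in for whatever the web server configuration would do; the USER_DB contents, the XML element names, and the serve() function are all assumptions for illustration, not part of any YADIS draft.

```python
import os
import tempfile
from xml.sax.saxutils import escape

# Hypothetical per-user records standing in for a provider's database.
USER_DB = {"mgraves": ["openid", "lid"]}


def synthesize_yadis(user):
    """Build a capabilities document on demand (element names are made up)."""
    caps = USER_DB.get(user)
    if caps is None:
        return None
    items = "".join("<capability>%s</capability>" % escape(c) for c in caps)
    return "<capabilities>%s</capabilities>" % items


def serve(doc_root, url_path):
    """Answer GET /<user>/yadis.xml from a static file if present, else the DB."""
    parts = url_path.strip("/").split("/")
    if len(parts) != 2 or parts[1] != "yadis.xml" or not parts[0].isalnum():
        return 404, ""
    user = parts[0]
    static_path = os.path.join(doc_root, user, "yadis.xml")
    if os.path.isfile(static_path):
        # The "notepad" case: the user saved the file by hand.
        with open(static_path, encoding="utf-8") as f:
            return 200, f.read()
    # The provider case: no file on disk, the response is generated.
    body = synthesize_yadis(user)
    return (200, body) if body is not None else (404, "")


with tempfile.TemporaryDirectory() as root:
    print(serve(root, "/mgraves/yadis.xml"))  # synthesized from USER_DB
    os.makedirs(os.path.join(root, "alice"))
    with open(os.path.join(root, "alice", "yadis.xml"), "w", encoding="utf-8") as f:
        f.write("<capabilities/>")
    print(serve(root, "/alice/yadis.xml"))    # read from the static file
```

The consumer sees the same request and the same kind of response in both cases; only the host knows which path produced it.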
The bottom line here is that URL semantics are not symmetric. Orienting a
specification around CGI semantics generally *precludes* users from
bootstrapping themselves without the help, knowledge, or permission of the
host/webserver. Orienting a specification around static file semantics
(GET "yadis.xml") does *not* preclude service providers from automating and
virtualizing the serving of "yadis.xml" or any other "static" request.
It is for this reason that I have urged that wherever possible, static file
semantics be embraced in HTTP requests, as it lets end users work things out
for themselves independently when needed -- a huge advantage in gaining
adoption for the spec.
(I realize this can't be extended completely. You can ask for a complete VCard
for example "http://idserver.net/mgraves/xvcard.xml", and that can be served
with a user-edited static file, but that's the simplest case. In the real
world, the "view" of a VCard is predicated on who's asking. When you need to
know the identity/credential of the asking party to determine the proper
response, a simple static file name won't suffice as a request. So the
question isn't whether parameterized URLs will be necessary -- they will --
but whether they will be necessary even for a core minimum functionality. I'm
currently thinking that a minimal setup would *not need* parameterized URLs, which
would enable the adoption-friendly "notepad principle" to help this thing
spread.)
4. There are some interesting suggestions here about the "ideal" approach to
this problem -- HTTP OPTIONS, etc. Anything that is going to require
extensions to conventional behavior of existing web servers is kryptonite for
adoption, however. There are lots of priorities represented on this list, but
for my part, and for the direction I'd like to see the infrastructure evolve,
the priority is just one thing: adoption. Anything that slows adoption
(changes to Apache) should be included grudgingly if at all. "The great is the
enemy of the good" comes to mind here.
5. URL squatting. While Microsoft did its typical blunder in the "Favicon"
introduction (unregistered MIME types, etc.), I don't think that the Wiki is
correct in saying that favicon.ico is an example of what the W3 folks complain
about as "URI squatting". Also, I don't think I'd even agree that it works
against the architecture principles of the Web. If it does, then the
conventional use of "index.html" does too, and it seems we've learned to live
with that. :-)
If you're OK with "index.html" being a *convention* for serving HTTP requests
on a directory, then I can't see any problem with "yadis.xml" becoming an
analogous convention to "index.html". Again, this is straightforward advocacy
of "convention over configuration", but I think that in this case, it's
crucial for minimizing the amount of accommodation by users and hosts to
achieve maximum adoption in the shortest possible timeframe.
6. Something to consider: Maybe Brad's right and the HTML <head> section is
all that should ever be changed to configure a URL for being used as an
OpenID/YADIS-enabled URL. What if we changed the architecture from one that
relied on the identity URL to answer a bunch of questions to one that simply
asked the identity URL to do one thing: tell us your authoritative identity server?
If we did this, a given URL would only point to the identity server, and
rather than querying the identity URL for its capabilities, we would simply
ask the identity server for anything else we needed. (I know Brad will
probably read this and think "Duh! This is what I've been saying all
along!"). Instead of asking for this:
http://superhost.com/mgraves/yadis.xml
The identity consumer would determine the delegated ID server for the URL (a
la OpenID), and ask that server: "What are the capabilities for
http://superhost.com/mgraves ?"
Once that indirection is achieved, the game is completely changed in terms of
the web server semantics. If we simply scrape some HTML to determine the
delegated ID server for the URL, and carry out everything else with the ID
server, we have a) minimal (maybe zero) impact on the hosting provider for the
Identity URL, and b) maximum flexibility in terms of features for the ID server.
The ID server can be written any way that's necessary. It's new machinery, and
has to be developed and deployed somehow anyway. The Host server can't be
required to change, or adoption will be severely constrained. Right now I'm
thinking if we just bow to Brad's (and David Recordon's) original instincts,
and make the Identity URL simply a lightweight pointer to a powerful Identity
server (based on open standards defined here), then we escape a lot of the
boxes we are currently trapped in in this discussion.
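A sketch of the second half of that flow, assuming the consumer has already scraped the delegated ID server URL from the identity page: it builds the question "what are the capabilities for this identity URL?" as a plain GET. The "identity" parameter name and the server URL are hypothetical, and "meta=capabilities" is borrowed from the CGI-style example in point 3; no agreed wire format is implied.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def capabilities_query(server_url, identity_url):
    """Return the URL a consumer would GET on the ID server.

    Existing query parameters on server_url are preserved; the identity URL
    is percent-encoded so it survives as a single parameter value.
    """
    parts = urlsplit(server_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("identity", identity_url))  # hypothetical parameter name
    query.append(("meta", "capabilities"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(capabilities_query("http://idserver.net/server",
                         "http://superhost.com/mgraves"))
```

Parameterized URLs are harmless at this step because the ID server is new machinery anyway; the host of the identity URL still only serves an ordinary HTML page.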
Sorry for such a long rambling post. I've been reading along for a while, but
just don't have time to chip in here for the most part.
-Michael Graves