A start towards Net::OpenID::UserProfile

Benjamin Trott ben at sixapart.com
Wed May 25 23:17:20 PDT 2005


Agreed, there's no point in parsing the same document twice.

You can use WWW::Blog::Metadata->extract_from_html and hand it a ref to a
scalar containing the HTML to parse, and then you could let it handle
parsing the HTML, calling the plugins, etc. It only makes one pass over the
HTML.
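
For illustration, a minimal sketch of that call. It assumes the page was
already fetched elsewhere (the $uri and $html values below are placeholders)
and that extract_from_html accepts the base URI as an optional second
argument; check the module's POD before relying on either.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use WWW::Blog::Metadata;

# Placeholder input: in practice this is the document that was already
# fetched while building the VerifiedIdentity object.
my $uri  = 'http://btrott.typepad.com/';
my $html = <<'HTML';
<html><head>
<title>Stupidfool.org</title>
<link rel="alternate" type="application/atom+xml" href="/typepad/atom.xml" />
</head><body></body></html>
HTML

# Hand over a reference to the scalar so the HTML is not copied or re-fetched.
# Assumption: the second argument is the base URI for resolving relative links.
my $meta = WWW::Blog::Metadata->extract_from_html(\$html, $uri)
    or die WWW::Blog::Metadata->errstr;

# Print whatever feeds were discovered in the <head>, if any.
print "$_\n" for @{ $meta->feeds || [] };
```

Any installed plugins still run their callbacks, so a plugin such as ::Profile
may make its own requests for the FOAF file or feed.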

Agreed re: trusted URLs, but you could probably do that with an on_finished
callback with a very low order (like 0 or 1) that removes any members
discovered during the parsing step that have untrusted URLs. Not great,
but...
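
A rough sketch of such a callback, not tested against the real module. The
plugin name, the $ROOT package variable, and the is_under_root helper are all
made up for illustration, and it assumes foaf_uri and feeds can be set through
their accessors. The URL check is plain string matching with no normalization.

```perl
use strict;
use warnings;

package WWW::Blog::Metadata::TrustedOnly;    # hypothetical plugin name

# Assumption: the caller sets this to the verified identity URL before
# extraction. Nothing in the module is known to provide it.
our $ROOT;

# True when $candidate shares scheme and host with $root and its path lies
# under the root's path. Ports, "..", and percent-escapes are not normalized.
sub is_under_root {
    my ($root, $candidate) = @_;
    return 0 unless defined $root && defined $candidate;
    (my ($r_origin, $r_path) = $root =~ m{^(https?://[^/?#]+)([^?#]*)}i)
        or return 0;
    (my ($c_origin, $c_path) = $candidate =~ m{^(https?://[^/?#]+)([^?#]*)}i)
        or return 0;
    return 0 unless lc($r_origin) eq lc($c_origin);
    $r_path .= '/' unless $r_path =~ m{/$};    # treat the root as a directory
    $c_path = '/' if $c_path eq '';
    return index($c_path, $r_path) == 0 ? 1 : 0;
}

sub on_finished {
    my $class = shift;
    my ($meta) = @_;

    # Drop the FOAF URI if it points outside the root. With no $ROOT set,
    # nothing is trusted.
    my $foaf = $meta->foaf_uri;
    $meta->foaf_uri(undef) if defined $foaf && !is_under_root($ROOT, $foaf);

    # Keep only trusted feeds; store undef rather than an empty list so later
    # plugins do not try to fetch a missing first feed.
    if (my $feeds = $meta->feeds) {
        my @trusted = grep { is_under_root($ROOT, $_) } @$feeds;
        $meta->feeds(@trusted ? \@trusted : undef);
    }
}

# Low order, intended to run ahead of plugins such as ::Profile (order 99).
sub on_finished_order { 1 }

1;
```

The caller would set $WWW::Blog::Metadata::TrustedOnly::ROOT to the identity
URL before calling extract_from_html.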

Ben


On 5/25/05 11:12 PM, "Brad Fitzpatrick" <brad at danga.com> wrote:

> The silly thing, though, is that it crawls a document that was already
> crawled in creating the VerifiedIdentity object.  Now, if we could hand
> WWW::Blog::Metadata the URL and <head> section as a scalar, that'd be more
> interesting.
> 
> But we also need a way to tell it that only URLs under the root one can be
> trusted when making the profile.
> 
> - Brad
> 
> 
> On Wed, 25 May 2005, Benjamin Trott wrote:
> 
>> Hi,
>> 
>> I decided to take my own advice and play around with WWW::Blog::Metadata, to
>> see how it could be the basis of a Net::OpenID::UserProfile module. It's
>> already plugin-based & knows how to get a bunch of information about a URI
>> (doesn't have to be a blog, despite the name).
>> 
>> What I came up with is below. It's a plugin called
>> WWW::Blog::Metadata::Profile, and when put in place, it will look in the
>> FOAF file or a feed (RSS or Atom) to find profile information. Sample usage
>> is the same as with any other WWW::Blog::Metadata usage:
>> 
>>     #!/usr/bin/perl -w
>>     use strict;
>> 
>>     use WWW::Blog::Metadata;
>>     use Data::Dumper;
>> 
>>     my $uri = 'http://btrott.typepad.com/';
>>     my $meta = WWW::Blog::Metadata->extract_from_uri($uri)
>>         or die WWW::Blog::Metadata->errstr;
>>     print Dumper $meta;
>> 
>> With both ::Profile and WWW::Blog::Metadata::Icon (looks in a bunch of
>> places for an image for the URI) installed, this outputs:
>> 
>> $VAR1 = bless( {
>>                  'feeds' => [
>>                               'http://btrott.typepad.com/typepad/atom.xml',
>>                               'http://btrott.typepad.com/typepad/index.rdf'
>>                             ],
>>                  'weblog_title' => 'Stupidfool.org',
>>                  'name' => 'Benjamin Trott',
>>                  'foaf_icon_uri' => 'http://btrott.typepad.com/ben-vienna.jpg',
>>                  'generator' => 'http://www.typepad.com/',
>>                  'favicon_uri' => 'http://btrott.typepad.com/favicon.ico',
>>                  'foaf_uri' => 'http://btrott.typepad.com/foaf.rdf',
>>                  'weblog_url' => 'http://btrott.typepad.com/typepad/',
>>                  'icon_uri' => 'http://btrott.typepad.com/ben-vienna.jpg'
>>                }, 'WWW::Blog::Metadata' );
>> 
>> Module below.
>> 
>> Ben
>> 
>> # $Id$
>> 
>> package WWW::Blog::Metadata::Profile;
>> use strict;
>> 
>> our $VERSION = '0.01';
>> 
>> use XML::FOAF;
>> use XML::Feed;
>> use URI::Fetch;
>> 
>> WWW::Blog::Metadata->mk_accessors(qw( name email weblog_url weblog_title ));
>> 
>> sub _fetch {
>>     my $res = URI::Fetch->fetch($_[0]) or return;
>>     \$res->content;
>> }
>> 
>> sub on_finished {
>>     my $class = shift;
>>     my($meta) = @_;
>> 
>>     if (my $uri = $meta->foaf_uri) {
>>         $class->_profile_foaf($meta, $uri);
>>     } elsif (my $feeds = $meta->feeds) {
>>         $class->_profile_feed($meta, $feeds->[0]);
>>     }
>> }
>> sub on_finished_order { 99 }
>> 
>> sub _profile_foaf {
>>     my $class = shift;
>>     my($meta, $uri) = @_;
>>     my $xml = _fetch $uri or return;
>>     my $foaf = XML::FOAF->new($xml, $uri) or return;
>>     my $person = $foaf->person or return;
>>     $meta->name($person->name);
>>     $meta->weblog_url($person->weblog);
>>     if ($person->mbox && $person->mbox =~ /^mailto:(.*)$/) {
>>         $meta->email($1);
>>     }
>> }
>> 
>> sub _profile_feed {
>>     my $class = shift;
>>     my($meta, $uri) = @_;
>>     my $xml = _fetch $uri or return;
>>     my $feed = XML::Feed->parse($xml) or return;
>>     $meta->weblog_title($feed->title);
>>     $meta->weblog_url($feed->link);
>>     $meta->name($feed->author);
>> }
>> 
>> 1;
>> 