Showing posts with label xml. Show all posts

Sunday, March 23, 2008

How to make OpenID really rock (user signup process)

Robby finds himself asking: how come I have 75 OpenIDs, and why can't I just transfer my details from site to site?

Well Robby, this is for you.


First, you need to grab XML_GRDDL. It's fairly stable at the moment, and is currently a PEPR proposal.

For now, do a

pear install http://xmlgrddl.googlecode.com/files/XML_GRDDL-0.0.4.tgz

... but if it gets through PEPR, this will be easier.


Be warned: you'll need PHP 5.2.5+, unless you can compile PHP against a decent version of libxml / libxslt.

You also need the XSL extension.
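If you'd rather check those two things before installing anything, here is a rough sketch. The function name grddl_missing_requirements is made up for this post and is not part of XML_GRDDL; it only looks at the PHP version and the XSL extension mentioned above, and it can't tell you whether your libxml / libxslt build is "decent".

```php
<?php
// Hypothetical helper (not part of XML_GRDDL): lists which of the
// prerequisites named in this post appear to be missing.
function grddl_missing_requirements($phpVersion, array $loadedExtensions)
{
    $missing = array();

    // The post asks for PHP 5.2.5 or later.
    if (version_compare($phpVersion, '5.2.5', '<')) {
        $missing[] = 'PHP 5.2.5 or later';
    }

    // Extension names are compared case-insensitively to be safe.
    $loaded = array_map('strtolower', $loadedExtensions);
    if (!in_array('xsl', $loaded)) {
        $missing[] = 'the XSL extension';
    }

    return $missing;
}

$missing = grddl_missing_requirements(PHP_VERSION, get_loaded_extensions());
if (empty($missing)) {
    print "Looks OK\n";
} else {
    print "Missing: " . implode(', ', $missing) . "\n";
}
```

Passing the version and extension list in as arguments keeps the check easy to try against versions other than the one you are running.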

Open up your favourite editor.

Paste in:

<?php
require_once 'XML/GRDDL.php';

/**
 * Example: Fetch multiple URLs about a specific user
 * and get useful information out.
 */
$urls = array();
$urls[0] = 'http://flickr.com/people/clockwerx/';
$urls[1] = 'http://www.linkedin.com/in/clockwerx';
$urls[2] = 'http://www.last.fm/user/CloCkWeRX/';
$urls[3] = 'http://clockwerx.blogspot.com/';

//For each URL, pretend it has these urls in <head profile="foo" />
//These look for hcard, hcalendar, etc.
$profiles = array();
$profiles[$urls[0]][] = 'http://www.w3.org/2002/12/cal/cardcaletc';
$profiles[$urls[1]][] = 'http://microformats.org/wiki/hresume-profile';
$profiles[$urls[1]][] = 'http://www.w3.org/2002/12/cal/cardcaletc';
$profiles[$urls[2]][] = 'http://www.w3.org/2002/12/cal/cardcaletc';
$profiles[$urls[3]][] = 'http://www.w3.org/2002/12/cal/cardcaletc';

//Set what kind of transformations we're interested in.
$options = XML_GRDDL::getDefaultOptions();
$options['quiet'] = true;

$grddl = XML_GRDDL::factory('xsl', $options);
$results = array();
foreach ($urls as $n => $url) {
    $data = $grddl->fetch($url);

    $data = $grddl->prettify($data);

    $modified_data = $grddl->appendProfiles($data, $profiles[$url]);

    $stylesheets = $grddl->inspect($modified_data, $url);

    //Run every transformation found, then merge the RDF/XML results into one document
    $rdf_xml = array();
    foreach ($stylesheets as $stylesheet) {
        $rdf_xml[] = $grddl->transform($stylesheet, $modified_data);
    }

    $results[$url] = array_reduce($rdf_xml, array($grddl, 'merge'));
}

print "We scuttered " . count($urls) . " urls and found these results\n";
foreach ($results as $url => $rdf_xml) {
    print $url . "\n";

    $sxe = simplexml_load_string($rdf_xml);
    $sxe->registerXPathNamespace('vcard', 'http://www.w3.org/2006/vcard/ns#');
    $sxe->registerXPathNamespace('ical', 'http://www.w3.org/2002/12/cal/icaltzd#');

    print "We found the following pieces of information, choose which are yours:\n";
    $xpaths = array();
    $xpaths["Formatted name"] = '//vcard:fn';
    $xpaths["First name"] = '//vcard:givenName';
    $xpaths["Last name"] = '//vcard:familyName';
    $xpaths["Email"] = '//vcard:email';
    $xpaths["Homepage or URl"] = '//vcard:url';
    $xpaths["Workplace name"] = '//vcard:organization-name';
    $xpaths["Photo URL"] = '//vcard:photo';
    $xpaths["Locality"] = '//vcard:locality';
    $xpaths["Position/Title"] = '//vcard:title';

    foreach ($xpaths as $name => $xpath) {
        //Named $matches so it doesn't clobber the $results array we are looping over
        $matches = $sxe->xpath($xpath);
        if (empty($matches)) {
            continue;
        }

        print $name . ": ";
        foreach ($matches as $node) {
            print trim((string)$node);
            //Some values live in an rdf:resource attribute rather than the element text
            $attributes = $node->attributes(XML_GRDDL::RDF_NS);
            if (!empty($attributes['resource'])) {
                print trim((string)$attributes['resource']);
            }
            print "\n";
        }
    }
    //print $rdf_xml . "\n\n";
    print "\n";
}


Save it, run it.

You *should get*:

---------- PHP ----------
We scuttered 4 urls and found these results
http://flickr.com/people/clockwerx/
We found the following pieces of information, choose which are yours:
Formatted name: DanielO'Connor
First name: Daniel
Last name: O'Connor
Homepage or URl: http://clockwerx.blogspot.com/
Locality: Klemzig
Position/Title: Web Developer

http://www.linkedin.com/in/clockwerx
We found the following pieces of information, choose which are yours:
Formatted name: Daniel
O'Connor
Adelaide Institude of TAFE
PEAR member
LIXI Members member
First name: Daniel
Last name: O'Connor
Homepage or URl: http://http;//clockwerx.blogspot.com
http://www.valuationexchange.com.au
Workplace name: PEAR
Valuation Exchange
Fresh FM
Self-employed
Adelaide Institude of TAFE
PEAR member
LIXI Members member
Locality: Adelaide Area, Australia
Position/Title: Developer at Valuation Exchange
Contributer
Software Developer
Web Developer
Freelancer

http://www.last.fm/user/CloCkWeRX/
We found the following pieces of information, choose which are yours:
Formatted name: Daniel O'Connor
NoBloodForOil
Homepage or URl: http://clockwerx.blogspot.com
Photo URL: http://userserve-ak.last.fm/serve/160/682792.jpg
http://userserve-ak.last.fm/serve/50/690470.gif

http://clockwerx.blogspot.com/
We found the following pieces of information, choose which are yours:
Formatted name: Daniel O'Connor
Email: mailto:daniel.oconnor@gmail.com
Homepage or URL: http://clockwerx.blogspot.com/
xmpp:daniel.oconnor@gmail.com
Workplace name: Valuation Exchange


Output completed (30 sec consumed)


Now, how neat is that? You can fetch any URL which publishes microformats, pull the hCards out of it, grab the information from those, and voilà! A pre-populated signup form.
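To make the "pre-populated signup form" bit concrete, here's a minimal sketch of that last step. It doesn't touch XML_GRDDL at all: the RDF/XML is a hand-written stand-in for what the transformations spit out (the real output may well be shaped differently), and signup_defaults plus the form field names are made up for illustration.

```php
<?php
// Assumption: a hand-written RDF/XML fragment standing in for the merged
// result of the GRDDL transformations.
$rdf_xml = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:vcard="http://www.w3.org/2006/vcard/ns#">
  <vcard:VCard>
    <vcard:fn>Daniel O'Connor</vcard:fn>
    <vcard:url rdf:resource="http://clockwerx.blogspot.com/"/>
    <vcard:locality>Klemzig</vcard:locality>
  </vcard:VCard>
</rdf:RDF>
XML;

// Hypothetical helper: maps form field names to the first value matched
// by each XPath expression. Fields with no match are left out.
function signup_defaults($rdf_xml, array $fields)
{
    $rdf_ns = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
    $sxe = simplexml_load_string($rdf_xml);
    $sxe->registerXPathNamespace('vcard', 'http://www.w3.org/2006/vcard/ns#');

    $defaults = array();
    foreach ($fields as $field => $xpath) {
        $matches = $sxe->xpath($xpath);
        if (empty($matches)) {
            continue;
        }
        $node = $matches[0];
        $value = trim((string)$node);
        if ($value === '') {
            // Fall back to rdf:resource when the element has no text.
            $attributes = $node->attributes($rdf_ns);
            $value = trim((string)$attributes['resource']);
        }
        $defaults[$field] = $value;
    }
    return $defaults;
}

$fields = array(
    'name'     => '//vcard:fn',
    'homepage' => '//vcard:url',
    'email'    => '//vcard:email',
);
foreach (signup_defaults($rdf_xml, $fields) as $field => $value) {
    // Escape everything: this data came from somebody else's web page.
    printf(
        '<input type="text" name="%s" value="%s" />' . "\n",
        $field,
        htmlspecialchars($value, ENT_QUOTES)
    );
}
```

It only takes the first match per field. As the LinkedIn output above shows, a real page can hand you several candidates, so a real signup form would want to let the user pick.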


Why is this neat?
* If an OpenID url has Microformats, bam! You can read it.
* If you are a bit more hardcore, you can hook up xOperator and a triplestore to this information.
* Or you could use it in Drupal.

There you have it, ladies and gents: semantic web in a box, with practical applications for user signup.

Tuesday, March 11, 2008

Microformats vs Machines

I've been toiling away in the background for a few weeks now, slowly putting XML_GRDDL through its paces and through the GRDDL spec tests.

It's finally at a point I'm mostly happy with - it fails some xml:base related tests, and doesn't like content negotiation - though it's hard to test against a misconfigured setup.

So now that I've hit a happy place, I thought I might as well do some explorations - particularly of microformats to RDF.

So, who's got real world microformats? Flickr! Upcoming! Hurray! Or maybe not quite GRDDL friendly formats.

Not even Tantek really bothers with profiles - apart from XFN.

Now I'm about stumped. How do I convince the vast majority of microformat users to look sideways at a //head[@profile] containing GRDDL friendly information?

Especially since the driving mantra behind microformats is people first, machines second.
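For the curious, checking whether a page bothers with //head[@profile] at all only takes a few lines. This is a sketch using PHP's DOM extension rather than anything from XML_GRDDL, and head_profiles is a name invented for this example.

```php
<?php
// Hypothetical helper: returns the profile URIs declared on a page's <head>.
function head_profiles($html)
{
    $doc = new DOMDocument();

    // Real-world pages are rarely valid, so collect libxml's complaints
    // internally instead of printing warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $profiles = array();
    foreach ($xpath->query('//head/@profile') as $attribute) {
        // The profile attribute is a whitespace-separated list of URIs.
        $uris = preg_split('/\s+/', trim($attribute->value), -1, PREG_SPLIT_NO_EMPTY);
        $profiles = array_merge($profiles, $uris);
    }
    return $profiles;
}

$html = '<html><head profile="http://www.w3.org/2003/g/data-view http://gmpg.org/xfn/11">'
      . '<title>Example</title></head><body></body></html>';

foreach (head_profiles($html) as $uri) {
    print $uri . "\n";
}
```

An empty result is the common case I'm grumbling about: the microformats are in the page, but nothing in the head tells a GRDDL agent where to find a transformation.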

Thursday, October 18, 2007

Government, and Software, Department of Transport Day

I turned a whinge into an email - the Adelaide Metro site sort of works, and lets me plan a route. It provides me with information, stuck into a PDF.

I can't remix it, it's not open, I can't copy and paste it in a meaningful way.

So, I wrote a letter:
  • Can I have some data?
  • I want to make a google transit type application
The answer came back:
  • No, we don't allow derivative applications
Huh? You guys do know about jNomad, right? The Adelaide metro's timetable on your phone project?

I didn't want to take the no I got for an answer.
  • I pay taxes
  • Adelaide Metro is run by the Department of Transport, Energy, and Infrastructure
  • Taxes made the data
  • The Adelaide Metro site says just ask if you want to use our copyrighted data.
So, I wrote this:

Hi again,

First off, I just want to assure you I'm not trying to be pushy and rude, because I absolutely hate it when people do that to me. If anything below comes off like that, please be assured it wasn't my intent at all; I'm just somewhat enthusiastic about open data & web applications.

The South Australian Government - Department of Transport, Energy and Infrastructure - Public Transport Division (DTEI-PTD) does not currently distribute this type of data set for the purposes of development of derivative products, for example the nextbus application quoted below.


Are there any kind of circumstances in which you guys would?
I'm aware of what's been done with jnomad; but I don't have access to it (not owning a phone). I think it's a great idea to improve the use of public transport and keep the public informed.


I have a problem though. My problem is that the Adelaide metro site has a trip planner - and it's reasonably good. It does most of what I want, most of the time. For the rest, I have to pick up a phone and hassle your call centre staff.

The trip planner can work out that I want to go from A to B; and tell me the timetables to read to do it. Unfortunately, that's a lot of
  1. Input data
  2. Download multiple pdfs
  3. Compare each
  4. Find the right time, date, bus number
  5. Repeat until I get it right
It's very hard for me to do something like: find a restaurant, find the relevant transport options, then make a booking. There's a considerable amount of effort involved in merging the two bits of data.

Using something like Google Maps/Google Transit (http://www.google.com/transit), I can get a lot more information a lot quicker, and I can visualise it ("Oh, I can get off the bus right near X", and "The restaurant is located at X, I need directions from my house to there").

It's really frustrating to know that all of the data is there, tucked away, and have no easy way to use it together.



As a PHP developer, and knowing that the Adelaide Metro site (and trip planner) is written in PHP, it doesn't seem to be a huge (design/technical) problem to produce plain text data[1] from whatever database you guys are using behind the scenes.

I'd suggest that given a little over two weeks full time, a reasonably competent PHP developer could read information from that database and produce the data.

I can't imagine any scenarios in which sharing this data does harm to the Adelaide Metro/transport department - most programmers don't run rival bus companies. If anything, I'd suggest this kind of thing achieves some of the spirit of the 2006 e-Government Strategy [2]. With careful licensing you could ensure non-commercial use only.

So; my questions are:
  • What licensing agreements were reached with jnomad's solution provider Laborotech?
  • Would the DTEI ever consider similar agreements with Google [3]?
  • Would the DTEI ever consider similar agreements with private citizens (in the form of a developer key or written agreement)?
  • Would the DTEI ever consider open licences for their data[4]?
  • Would the costs of developing either of the above be considered prohibitive; or outside of the current IT resources available?
  • Are there more appropriate channels to pester with these kinds of questions?

[1] http://code.google.com/transit/spec/transit_feed_specification.htm#transitFeedSubmit
[2] http://www.agimo.gov.au/publications/2006/march/introduction_to_responsive_government
[3] http://www.google.com/intl/en_ALL/help/faq_transit.html#adding_an_agency
[4] http://creativecommons.org/licenses/by-nc-sa/2.0/


which got me an invite:

Thanks Daniel for your comments. I would like to meet with you to discuss the content of this email further, and while we are at it show you the great things we are doing.



Wow! I thought. I've pierced the layers of red tape and communicated with this person, who has the ability to act on things.

So today, I went in for the meeting.

When I came out, I'd never felt so dismayed with government before in my life.


The subtext of the entire meeting went:
I didn't read your email fully. You are one of those annoying people, and I've brought you in here to shut you up. Since you are a young person, I also must speak to you like a 5 year old, which you obviously are, on account of your youth. Q.E.D.
The spoken meeting went more like:
  • We are already doing that internally! Here's an internal web application, which I'll click through, on my side of the desk, and not actually show you. IMPREZZLED? I AM!
  • Open data? What's that? Obviously, we can't let you use any of our data. I mean, who do you think you are, someone who's paid for this kind of thing? You are John Q. Public, and I can't imagine a single use for you to have this data. No use for you, despite the fact there's enough data to keep me in a manager's position at this department, and manage all sorts of people who spend all the time in the world working with it.
  • What do you mean, "is Jnomad a commercial company who sought us out, and sold us a solution, or is it a company that responded to a tender"? We have a contract. It's spelt... k-o-n-t-r-u-c-t. You are John Q. Public, and we don't make contracts with you, you need money before we'll talk.
  • Don't worry, concerned member of the public. The government has it all under control.
  • Since you are using big words in your questions, if I furrow my brow and talk about something else, I don't have to answer the original question.
  • Oh, you're leaving after only 8 minutes of this? IF YOU EVER WANT A JOB YOU HAVE MY NUMBER! WOW I HAVE REALLY FIXED THE PROBLEMS OF THE WORLD TODAY!

The only snippets of information that were actually valuable amounted to:
  • We don't know what open government is
  • Data is OURS, ALL OURS!
  • ... and one other which I won't mention here, for a while.

I should have known, as this is the same government who used EDS to build a registration site - when launched, it had "click here" in plain text on the front page.

So I left, knowing that there was no way I could ever communicate what I was trying to this person.

In other news:
  • Renewing my driver's licence didn't suck
  • Except the website misled me about the cost: it said "$30 to renew, periods of 1-10 years available", when it meant $30 per year, so renewing for 10 years costs $300
  • The new tram has an awesome tram stop
  • It's neat that it's free for some sections
  • For all of the money spent on the damned thing, the tram seats are rock hard. The old tram was far more comfortable.