I recently became aware of a new initiative in the data portability arena: Open Data Definition (ODD). I applaud any effort to make your data more portable between social networks. But to really succeed, the idea must be well thought out, and research must be done to see what is already out there and what really works.
Ben Werdmuller’s blog post introducing ODD defines it as a new format for importing and exporting data from social applications. He stresses that the project has arisen from real-world needs, not academic exercises. He makes a good argument for data portability, relating it to the desktop, where you take a file created in one app and open it in another. This is basic stuff, but the Web can’t do it yet.
Where I feel he has seriously missed the boat is in the section where he talks about the Semantic Web. He describes the community as ambiguous and overcomplicated:
The semantic web community has RDF, a format designed for the purpose that is potentially powerful but – as one might expect from the semantic web community – prone to ambiguity and overcomplicated implementation.
And then the biggest problem I see with the entire argument comes from this:
… In small doses, it works (FOAF is based on a subset of RDF), but for more abstract data, it becomes exponentially harder to build for. Adding new data fields requires doing contortions in XML, which makes it harder to generate dynamically.
Now this is just plain odd. RDF is *not* XML; RDF/XML is only one of several ways to write it down. RDF is an abstract format that was built to be a scalable data format for the World Wide Web (I stress world wide). RDF is built to model information of *any* shape and any size. It is quite simple really. Take three things (a subject, a predicate and an object), put them together and you have an RDF statement. The example below is written in Turtle, an RDF format made for human writers. It is just one of many formats that can all be used to write RDF. RDFa is another way to write RDF. It embeds RDF into HTML, a bit like microformats but more extensible.
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://data.boab.info/david/foaf.rdf#me> foaf:name "David Peterson".
<http://data.boab.info/david/foaf.rdf#me> foaf:weblog <http://www.sitepoint.com/articlelist/497>.
<http://data.boab.info/david/foaf.rdf#me> foaf:based_near <http://dbpedia.org/resource/Townsville>.
With those three sentences I have now constructed this graph:
It is all pretty simple
Of course I can continue to add anything I want to <http://data.boab.info/david/foaf.rdf#me>, or, if I like, to <http://dbpedia.org/resource/Townsville>. Just add more things…
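To make the "just add more things" point concrete, here is a rough sketch in Python that treats the statements above as plain (subject, predicate, object) tuples. It deliberately uses no RDF library, it does not distinguish literals from URIs the way real RDF does, and the names FOAF, ME, TOWNSVILLE and describe are my own illustrative choices. The extra label statement at the end is made up for the example, not taken from DBpedia.

```python
from collections import defaultdict

FOAF = "http://xmlns.com/foaf/0.1/"
ME = "http://data.boab.info/david/foaf.rdf#me"
TOWNSVILLE = "http://dbpedia.org/resource/Townsville"

# Each statement is one (subject, predicate, object) triple.
# A set of triples is the whole graph; there is no schema to migrate.
triples = {
    (ME, FOAF + "name", "David Peterson"),
    (ME, FOAF + "weblog", "http://www.sitepoint.com/articlelist/497"),
    (ME, FOAF + "based_near", TOWNSVILLE),
}


def describe(graph, subject):
    """Collect everything the graph says about one subject, keyed by predicate."""
    found = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            found[p].add(o)
    return dict(found)


# Adding a new "field" is just adding another triple.
# This label is an illustrative assumption, not data fetched from DBpedia.
triples.add(
    (TOWNSVILLE, "http://www.w3.org/2000/01/rdf-schema#label", "Townsville")
)
```

Calling describe(triples, ME) returns the three things said about me, and nothing about the existing statements had to change when the Townsville label was added.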
Real data portability
My FOAF file is my online identity. I own it, I can put it on any server I want. This is the ultimate in data portability. It lists who my friends are, my contact details, anything I want. Again, RDF is infinitely extensible.
The wonderful thing about RDF and linked data is that each of the things named in the statements above can be retrieved via its URL. This is the basis of linked data. I can traverse the graph by GET-ing the URL with my browser, with a REST call, etc. If you want to know more about where I live, go to http://dbpedia.org/resource/Townsville. From there you can find out the geo coordinates and pretty much anything you would want :)
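Following a link in the graph is an ordinary HTTP GET. The sketch below uses only the Python standard library; the function names are mine, and asking for Turtle through the Accept header is an assumption about what a given server supports, so treat it as a starting point rather than a recipe.

```python
from urllib.request import Request, urlopen


def linked_data_request(url, accept="text/turtle"):
    """Build a GET request that asks the server for an RDF representation.

    Whether the server honours the Accept header is up to the server;
    "text/turtle" is simply the format assumed in this sketch.
    """
    return Request(url, headers={"Accept": accept})


def fetch(url, accept="text/turtle", timeout=10):
    """Dereference a URL and return the response body as text."""
    with urlopen(linked_data_request(url, accept), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With that in place, fetch("http://dbpedia.org/resource/Townsville") would make the same hop through the graph that your browser makes when you click the link.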
The new state of the Web
The premise that this stuff is too hard and complicated doesn’t wash anymore. It might have a few years ago, but with Yahoo throwing its full support behind Semantic Web standards and the upcoming Drupal 7 set to have an RDF linked data core, that premise is definitely out of touch.
The thorough blog post by Henry Story (foaf profile) entitled "Proof: Data Portability requires Linked Data" goes into much greater detail. I encourage anyone working in the data portability space to read it and to subscribe to the DataPortability general discussion on Google Groups.
Any attempt to make data portability actually work needs to take into account RDF, which is the fundamental building block of the Semantic Web. And don’t take my word for it; here is a quote from another blog: