Add Semantic Richness To Your Markup With (RDF) Ease

Image credit: Paul Goyette

While everyone agrees that semantics are useful, and are becoming more so all the time, few people agree on how to develop and use them. John Allsopp’s recent article on Semantics in HTML 5 pragmatically highlights a number of problems inherent to the limitations of HTML, and the lack of an extensible mechanism for future semantics is chief among them. Another question that John’s article raised relates to the mechanism for encoding semantics on the Web itself.

Recently, a new mechanism for adding semantics to (X)HTML pages called RDF-EASE (short for RDF Extracted Attributes from Styled Elements) has been getting some attention. Although it’s still very early, RDF-EASE, which can be mixed with CSS, has some intriguing characteristics that may give us a glimpse into the future of semantic Web technologies.

How far away from “real content” is too far for semantics?

RDF-EASE borrows heavily from both RDFa and CSS. The result is a CSS-like abstraction of semantics from a document’s underlying markup. On the draft specification page, RDF-EASE’s ten-second sales pitch is:

CSS is an external file that specifies how your document should look; RDF-EASE is an external file that specifies what your document means.

For example, this means you could replace your reading list marked up in RDFa like this:

<ul>
    <li typeof="biblio:book"><cite property="dc:title">One Hundred Years of Solitude</cite> by <span property="dc:creator">Gabriel García Márquez</span></li>
    <li typeof="biblio:book"><cite property="dc:title">The Alchemist</cite> by <span property="dc:creator">Paulo Coelho</span></li>
</ul>

with POSH (plain old semantic HTML, devoid of any RDFa whatsoever) paired with RDF-EASE like this:

ul li {
    -rdf-typeof: "biblio:book";
}
ul li cite {
    -rdf-property: "dc:title";
}
ul li span {
    -rdf-property: "dc:creator";
}

Developers or (more interestingly) automated scrapers could use XHTML documents with a layer of RDF-EASE on top to automatically generate RDF documents for whatever machine-readable purpose they have in mind. Moreover, changing ontologies in the future won’t require any changes in the markup. Since RDF-EASE is, ostensibly, easier for web designers to produce than other forms of semantic markup, it could serve as a catalyst for bringing semantics to more of the Web.
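To make the scraper idea a little more concrete, here is a minimal sketch in Python of how a tool might pair the plain list markup with the three rules above and emit triples. It is not an RDF-EASE parser: the RULES table is a hand-transcribed stand-in for the rules, the PLAIN_HTML string is my own RDFa-free version of the reading list, and the names matches and TripleExtractor are hypothetical. I am also assuming that a typeof rule starts a new blank-node subject, loosely following what RDFa does, and that the markup is well formed.

```python
from html.parser import HTMLParser

# Hand-transcribed stand-in for the RDF-EASE rules above (not a parser for the
# real syntax): descendant selector as a tuple of tag names -> (kind, term).
RULES = {
    ("ul", "li"): ("typeof", "biblio:book"),
    ("ul", "li", "cite"): ("property", "dc:title"),
    ("ul", "li", "span"): ("property", "dc:creator"),
}

# Assumed plain version of the reading list, with every RDFa attribute removed.
PLAIN_HTML = """
<ul>
    <li><cite>One Hundred Years of Solitude</cite> by <span>Gabriel García Márquez</span></li>
    <li><cite>The Alchemist</cite> by <span>Paulo Coelho</span></li>
</ul>
"""


def matches(selector, stack):
    """True if the innermost open element matches a CSS-style descendant selector."""
    if not stack or selector[-1] != stack[-1]:
        return False
    ancestors = iter(stack[:-1])
    # `tag in ancestors` consumes the iterator, so ancestor order is respected.
    return all(tag in ancestors for tag in selector[:-1])


class TripleExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []      # names of the currently open elements
        self.subject = None  # blank node for the current typed element
        self.count = 0       # number of blank nodes created so far
        self.pending = None  # (term, depth) of a property element being read
        self.text = []       # text collected for the pending property
        self.triples = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        for selector, (kind, term) in RULES.items():
            if not matches(selector, self.stack):
                continue
            if kind == "typeof":
                # Assumption: like RDFa's typeof, this starts a new subject.
                self.count += 1
                self.subject = f"_:b{self.count}"
                self.triples.append((self.subject, "rdf:type", term))
            else:
                self.pending, self.text = (term, len(self.stack)), []

    def handle_data(self, data):
        if self.pending:
            self.text.append(data)

    def handle_endtag(self, tag):
        # Close the pending property when its own element ends.
        if self.pending and self.pending[1] == len(self.stack):
            value = "".join(self.text).strip()
            self.triples.append((self.subject, self.pending[0], value))
            self.pending = None
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


extractor = TripleExtractor()
extractor.feed(PLAIN_HTML)
extractor.close()
for triple in extractor.triples:
    print(triple)
```

Notice that swapping biblio:book for a term from a different vocabulary only touches the RULES table; the HTML string never changes, which is the point the paragraph above makes about changing ontologies.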

While “more semantics as a good thing” is not being challenged, RDF-EASE does introduce a conceptually problematic notion for me: it moves the semantics of a document further away from the content itself. Some cons I see for RDF-EASE related to the fact that it is “CSS-like” are that it:

potentially violates “separation of concerns” by mixing presentation with semantics, since it makes it possible to embed semantics (in the form of RDF-EASE) directly into otherwise presentational style sheets.
creates an additional layer of indirection and could therefore further obfuscate semantic issues, and actually make semantic ideas harder for novices to pick up rather than easier.
doesn’t provide the same kinds of backwards compatibility as RDFa does, since subtle, necessary differences in cascading rules and syntax mean off-the-shelf CSS parsers are not really suitable for RDF-EASE.
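The first concern is easier to see with a mixed sheet in hand. The sketch below is only an illustration: MIXED_SHEET is a made-up style sheet in which I have added presentational declarations next to the -rdf- ones, and split_concerns is a hypothetical helper that uses a naive regular expression rather than a real CSS or RDF-EASE parser (it would break on comments, nested blocks, or semicolons inside quoted values).

```python
import re

# Made-up mixed sheet: presentational CSS and RDF-EASE-style declarations
# living side by side in the same rules.
MIXED_SHEET = """
ul li {
    list-style: square;
    -rdf-typeof: "biblio:book";
}
ul li cite {
    font-style: italic;
    -rdf-property: "dc:title";
}
"""


def split_concerns(sheet):
    """Return (presentation, semantics), each mapping selector -> {property: value}."""
    presentation, semantics = {}, {}
    for match in re.finditer(r"([^{}]+)\{([^{}]*)\}", sheet):
        selector = " ".join(match.group(1).split())
        for declaration in match.group(2).split(";"):
            name, sep, value = declaration.partition(":")
            if not sep:
                continue  # blank fragment after the last semicolon
            name = name.strip()
            # Assumption: anything prefixed with -rdf- counts as semantics.
            target = semantics if name.startswith("-rdf-") else presentation
            target.setdefault(selector, {})[name] = value.strip()
    return presentation, semantics


presentation, semantics = split_concerns(MIXED_SHEET)
print(presentation)
print(semantics)
```

Every consumer of such a file has to do some version of this sorting, which is roughly what I mean by the two concerns getting tangled.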

Again, the potential issue I see comes in the form of the explicit separation of the semantics of content from the content itself. The beautiful thing about microformats and RDFa is that they’re right where your content is, so both humans and machines can get what they need from the markup in the same file. This reduces the complexity (and therefore the cost) of CMS programs and other software, and—I argue—the potential for human (developer) error.

Then again, perhaps separating semantics from the content’s XML-ish markup isn’t such a big deal. Things with RDF-EASE are not entirely negative. Some of the pros I see for RDF-EASE are that it:

is easier for more people to understand and pick up, thanks to its somewhat simpler scope and more familiar CSS-like syntax (the alternative is typically XSLT).
saves keystrokes by cascading semantics onto groups of elements as opposed to requiring semantic attributes to be applied to individual elements as RDFa does.
enables greater compatibility and easier conversions between microformats and RDFa (by providing a nearest-ancestor() “scoping mechanism”).

There are arguments on both sides, and these days we’re putting semantics just about everywhere we can, including elements (like HTML5’s proposed header, section and aside), attributes (like rel and rev), and attribute values (like title in abbr elements). Therefore, given the obvious need for future extensibility, the question becomes whether the purposes of these semantics are distinct enough to warrant their own interfaces. Is it good that semantics seem to have found their way into all these technical mechanisms, and now perhaps even “semantic style sheets,” or is that bad? Quite honestly, I don’t think anyone knows yet.

Could RDF-EASE be the catalyst we need for “personalized ontologies”?

All that being said, I wonder if the greatest use case for RDF-EASE and its semantic abstractions actually borrows from CSS in a most unexpected place: user style sheets.

One of CSS’s promises is the personalization of Web content by the visitor. Sadly, this has yet to become commonplace. Nevertheless, user style sheets can do amazing things to web sites and are an incredibly powerful mechanism to personalize the way the Web looks to you.

Perhaps, then, RDF-EASE’s greatest use case is in the creation and distribution of personalized ontologies. Rather than being forced to rely on a document author’s semantics, RDF-EASE could—theoretically—give you a way to personalize the meaning of a site’s content, similar to the way user style sheets let you personalize a site’s appearance. If tools were built flexibly enough to allow plug-and-play vocabularies and ontologies, perhaps you could even use RDF-EASE to personalize the way a site behaved.

At first this may not sound like much more than ticking preference boxes, but as we move further into an age of automated tools that fundamentally depend on document semantics, making it easier for users to define their own semantics will undoubtedly prove just as important to principles like free speech and self-expression as the Creative Commons movement is for “remixing” today and open source initiatives were for the software industry.

Don’t like that your relationships are defined as “it’s complicated”, or that you’re asked for a gender on thousands of sites where your only options are “male” or “female”? Maybe if these sites used semantic ontologies as the model for that datum you would be able to write your own RDF-EASE transformation to override the web site’s options with the ones you provide. Further, since your choices would be linked up with other ontologies, suddenly your tools would interact with you in ways that were more meaningful for you, without needing the developer’s prior awareness or blessing for you to do this.
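As a thought experiment, the override could work much like a user style sheet layered over an author's. The sketch below assumes rule tables shaped as selector -> declarations and assumes the user's declarations simply win on a clash; how RDF-EASE itself would order author and user rules is not something I can point to in the draft. The selectors and the site: and my: terms are hypothetical, and merge_rules is an illustrative helper, not part of any tool.

```python
def merge_rules(author, user):
    """Combine two rule tables; on a clash the user's declaration replaces the author's."""
    merged = {selector: dict(decls) for selector, decls in author.items()}
    for selector, decls in user.items():
        merged.setdefault(selector, {}).update(decls)
    return merged


# Hypothetical rules shipped by the site's author.
AUTHOR_RULES = {
    ".profile .status": {"-rdf-property": "site:relationshipStatus"},
    ".profile .name": {"-rdf-property": "foaf:name"},
}

# Hypothetical rules supplied by the visitor, pointing at their own vocabulary.
USER_RULES = {
    ".profile .status": {"-rdf-property": "my:partnership"},
}

effective = merge_rules(AUTHOR_RULES, USER_RULES)
for selector, declarations in effective.items():
    print(selector, declarations)
```

The author's mapping for the name survives untouched, while the status field now means whatever the visitor's vocabulary says it means.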

Now, that’s what I’d call the personalized Web.