Semantic tagging with Calais and Drupal
Gosh. It just can’t get any easier than this. Automatically and (pretty) intelligently tagging your content is a very powerful thing, especially if those tags are structured tags and not just text strings. Manual tagging used to be a fairly painful proposition: do I use a folksonomy, or do I create meaningful taxonomies? And will this scale??? Well, fret no more. Let someone (or something) else do it for you.
I have been using the Drupal Calais tagging module for the past month and have been pleasantly surprised at how effective it has been. I have thrown emails, blog posts, and random text from the Web at it. It all comes back with at least one decent tag (and that was a pretty murky test sample).
Today, Thomson Reuters released version 2.0 of its Calais web service. This is cool in many ways:
- They now tag many more non-news items (EntertainmentAwardEvent, MedicalCondition, Movie, MusicAlbum, MusicGroup, PublishedMedium, SportsEvent, SportsGame and TVShow). Previously their focus was on news events so most non-news posts received few tags.
- Tools to allow embedding of metadata for Yahoo! SearchMonkey
- WordPress tagging plugin
- Previously mentioned Drupal module
Installing the module is simple and only requires a few steps. First, download it from the link above. I used the version for Drupal 6; the install process is slightly different for Drupal 5.
Second, if you are using Drupal 6, you have to download the RDF module by the amazing Arto. It has to be installed before the Calais module. If you are using Drupal 5, you have to install the ARC2 library into the opencalais/arc_rdf/arc2 directory.
It really isn’t hard. Just download two files, uncompress them into your sites/all/modules directory and activate them.
Finally, there is one last step before your posts are semantically tagged and your life is transformed. Select which content types you want Calais to work its magic on.
Here you have three options:
- Not processed by Calais (default)
- Have keywords be suggested (manual term association)
- Have keywords automatically applied *my favourite*
I set each of my types to the third option as it is easiest (always good) and I don’t have to remember to do anything extra. Just set it and forget it. As soon as you save your post, the tagging process begins. Although the tagging is quite fast, it does add a second or two to the time it takes your content to save.
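If it helps to picture the difference between the three options, here is a rough sketch in Python. It is not the module's actual code (the module is PHP, and I have not reproduced its internals); the names `CalaisMode` and `on_save` are made up for illustration, and the list of Calais tags is assumed to have already come back from the web service.

```python
from enum import Enum


class CalaisMode(Enum):
    # Hypothetical names mirroring the three settings listed above.
    NOT_PROCESSED = 0  # default: Calais is never consulted
    SUGGEST = 1        # keywords are suggested, you associate them manually
    AUTO_APPLY = 2     # keywords are applied automatically on save


def on_save(mode, existing_tags, calais_tags):
    """Return (tags_to_store, suggestions_to_show) for one saved post.

    `calais_tags` stands in for whatever the web service returned;
    fetching it is deliberately left out of this sketch.
    """
    if mode is CalaisMode.NOT_PROCESSED:
        return list(existing_tags), []

    if mode is CalaisMode.SUGGEST:
        # Only offer tags the post does not already carry.
        suggestions = [t for t in calais_tags if t not in existing_tags]
        return list(existing_tags), suggestions

    # AUTO_APPLY: merge the new tags in, keeping order and skipping duplicates.
    merged = list(existing_tags)
    for tag in calais_tags:
        if tag not in merged:
            merged.append(tag)
    return merged, []
```

With the third option, `on_save(CalaisMode.AUTO_APPLY, ["drupal"], ["Thomson Reuters", "drupal"])` stores both tags and shows no suggestions, which is the "set it and forget it" behaviour.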
This new release of Calais really brings a lot of power at almost no cost to you — well, it did take about 8 minutes to install it. It is a no-brainer to get it and at least try it. The addition of the embedded metadata for SearchMonkey really propels this release into the stratosphere.
Now, for a bit of extra spice. Download the Tagadelic module and get super easy tag clouds based on all your hard work :)
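For the curious, the general idea behind a tag cloud can be sketched in a few lines of Python. This is only an illustration of the concept, assuming a simple linear scale; it is not how Tagadelic itself computes its weights, and the function name `tag_cloud_levels` and the default of six size levels are my own choices.

```python
from collections import Counter


def tag_cloud_levels(tags_per_post, levels=6):
    """Map each tag to a size level from 1 (rare) to `levels` (most used).

    `tags_per_post` is a list of tag lists, one per post.
    A linear scale is assumed here purely for illustration.
    """
    counts = Counter(tag for tags in tags_per_post for tag in tags)
    if not counts:
        return {}

    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest

    result = {}
    for tag, count in counts.items():
        if span == 0:
            # Every tag is used equally often, so they all get the smallest size.
            result[tag] = 1
        else:
            result[tag] = 1 + (count - lowest) * (levels - 1) // span
    return result
```

Feed it the tags from three posts, say `[["Drupal", "Calais"], ["Drupal"], ["Drupal", "SearchMonkey"]]`, and "Drupal" lands in the largest size while the other two stay at the smallest.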
And to help you keep it all together and browseable, how about getting Taxonomy VTN?
That will really wow your clients and you can tell them how hard it all was…