Structured data - set to take on Google

After posting the Yahoo microsearch entry last night I had a thought. Google has already mastered web 1.0 style search. I use it all the time (even if I am not always thrilled with the results), and I rarely use Microsoft or Yahoo or anything else when I just want to do a quick search. So how can others compete? Even if Microsoft and Yahoo merge and Microhoo emerges, how can they take on Google? Well, with basic search I don’t think they can.

This is when Yahoo microsearch jumps up and starts waving its arms… “Why do a search on conference in San Francisco on Google and get this”:

“When you can do the same search on me, microsearch, and get this:” (not sure why microsearch speaks in the first person)

Note the map, the timeline and the little bar sitting above it all. It shows the amount of extracted metadata – the really useful bits. In the above search it pulled out 16 vCards and 182 events. It also makes all this metadata goodness available as RDF if you click the little box logo located at the top left.

Of course nothing stops Google from offering this (I am sure they are already thinking about it), which would be great. The thing that levels the playing field is the embedded structured data (microformats, RDF, RDFa, eRDF). This semantic data gives important information that doesn’t require amazing guesswork. No longer does a search engine have to guess what was meant; the author can explicitly say what they meant. That is an important difference. A critical difference.
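To make the "no guesswork" point concrete, here is a minimal sketch of pulling events out of a page marked up with the hCalendar microformat, using only the Python standard library. The sample markup, the `HCalendarParser` class and the `extract_events` helper are made up for illustration. This is not how microsearch itself works, and it skips most of the real microformat parsing rules (for example, it assumes every tag inside an event is properly closed).

```python
from html.parser import HTMLParser

# Hypothetical page fragment using hCalendar class names.
SAMPLE_HTML = """
<div class="vevent">
  <span class="summary">Example Web Conference</span>
  <abbr class="dtstart" title="2008-04-22">April 22</abbr>
  <span class="location">San Francisco, CA</span>
</div>
"""


class HCalendarParser(HTMLParser):
    """Collects summary, dtstart and location from each vevent block."""

    FIELDS = ("summary", "dtstart", "location")

    def __init__(self):
        super().__init__()
        self.events = []
        self._current = None  # event being built, or None outside a vevent
        self._field = None    # field whose text is being collected
        self._depth = 0       # nesting depth inside the current vevent

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._current is None:
            if "vevent" in classes:
                self._current = {}
                self._depth = 1
            return
        self._depth += 1
        for name in self.FIELDS:
            if name in classes:
                if tag == "abbr" and attrs.get("title"):
                    # Machine-readable value lives in the title attribute.
                    self._current[name] = attrs["title"]
                else:
                    self._field = name
                    self._current[name] = ""

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if self._current is None:
            return
        self._field = None
        self._depth -= 1
        if self._depth == 0:
            # Closing tag of the vevent container: the event is complete.
            self.events.append(self._current)
            self._current = None


def extract_events(html):
    parser = HCalendarParser()
    parser.feed(html)
    return parser.events


if __name__ == "__main__":
    for event in extract_events(SAMPLE_HTML):
        print(event)
```

The parser never has to guess which string is the date or the venue; the author's class names say so, and that is the whole appeal.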

I know this opens up a HUGE can of worms for spam, but with social networking and trust systems (FOAF, OpenID, OpenAuth) it can be counteracted. All embedded data will be indexed, but the new, future web search engines will have trust algorithms as their secret sauce. A new “Do I trust this?” or “Does my network trust this?” filter will sit on top of this mass of information.
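As a rough illustration of what a “Does my network trust this?” filter could look like, the sketch below keeps only results whose publisher is within a couple of hops of me in a FOAF-style "knows" graph. Everything here is an assumption made for the example: the graph, the `publisher` field on each result, and the hop limit. A real trust algorithm would need far more than this.

```python
from collections import deque


def trusted_people(knows, me, max_hops=2):
    """Breadth-first walk of the 'knows' graph, up to max_hops away from me."""
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in knows.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return seen


def filter_results(results, knows, me, max_hops=2):
    """Keep only results published by someone inside my trust network."""
    trusted = trusted_people(knows, me, max_hops)
    return [r for r in results if r["publisher"] in trusted]


if __name__ == "__main__":
    # Hypothetical social graph and search results.
    knows = {"me": ["alice"], "alice": ["bob"], "bob": ["carol"]}
    results = [
        {"title": "Conference in SF", "publisher": "alice"},
        {"title": "Another event", "publisher": "carol"},
        {"title": "Cheap pills", "publisher": "spammer"},
    ]
    print(filter_results(results, knows, "me"))
```

With a two-hop limit only the result from alice survives: carol is three hops away and the spammer is not in the network at all.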

Google works with secret recipes that do a pretty good job with the huge mass of unstructured information, but when the importance of that starts to diminish, where does that put Google? I am not going to shed any tears for them yet, as it is still early days, but Yahoo microsearch is leading in the right direction.

The classic chicken-and-egg syndrome looks to finally be resolving itself: no one will go to the effort of embedding structured data if no one is using it or asking for it. Making this data a first-class citizen will solve that problem. Microformats and Semantic Web metadata formats can easily live happily with each other and in fact can leverage each other’s strengths to make the web a more usable place for everyone.
