
Big news for the Semantic Web

May 25th, 2009 by Peter

The last couple of weeks have seen some big announcements in the Semantic Web. Google announced a number of new features at Searchology 2009, and Wolfram|Alpha soft-launched an alternative to the search engine, which they’re calling a computational knowledge engine.

The two represent very different approaches to the semantic web. The web has traditionally been designed and written for humans. The semantic web represents an effort to design and write the web for machines. A human can run a search, look at certain results in context, and determine that one particular string is an address, another is a phone number, another is a name. But a machine struggles to tell those apart. The “lower-case” semantic web attempts to introduce some structure to the human web in order to assist a machine’s interpretation. The strong “upper-case” Semantic Web introduces data specifically designed for machines, often invisible to humans.

The Microformats effort attempts to extend the human web by encouraging the adoption of common semantic structures in order that a machine might also read this web. Google’s recent announcements included initial support for the hCard and hReview microformats, used to identify names and addresses (hCard) and reviews (hReview). Initially, Google are reading the hReview microformat to pull ratings for businesses and include that information in search results.

While this is only the initial stage, we’ve seen Google gradually extend its search results with richer snippets in the past, and this is an indicator that they’re taking microformats seriously. This is a strong case for adding hCard (or other microformats) to your website: it’s not hard to imagine Google pulling address data and showing maps next to search results. This is something we’ll be adding to the Gruden website in the coming weeks, and we’ll include details of our implementation. In the meantime, Google’s webmaster documentation includes detailed examples of how to mark up business and organisation data.
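To make the idea concrete, here is a rough sketch of how a machine might read hCard markup. The business details, the `HCardParser` class and the `parse_hcard` helper are all made up for illustration; only the class names (`vcard`, `fn`, `org`, `adr`, `tel` and so on) come from the hCard format. It is not how Google does it, and it only covers a small subset of hCard.

```python
from html.parser import HTMLParser

# Hypothetical business details, marked up with hCard class names.
HCARD_HTML = """
<div class="vcard">
  <span class="fn org">Example Pty Ltd</span>
  <div class="adr">
    <span class="street-address">1 Example Street</span>,
    <span class="locality">Sydney</span>
    <span class="region">NSW</span>
    <span class="postal-code">2000</span>
  </div>
  <span class="tel">+61 2 5550 0000</span>
</div>
"""

# hCard property names this sketch looks for (a small subset of the format).
HCARD_FIELDS = {"fn", "org", "street-address", "locality",
                "region", "postal-code", "tel"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class names an hCard property.

    Simplification: assumes every start tag has a matching end tag,
    so void elements such as <br> are not handled.
    """

    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # one entry per open element: its hCard fields

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append([c for c in classes if c in HCARD_FIELDS])

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        # Text belongs to the innermost open element only.
        for field in self._stack[-1]:
            self.card[field] = self.card.get(field, "") + text


def parse_hcard(html):
    """Return a dict mapping hCard property names to their text."""
    parser = HCardParser()
    parser.feed(html)
    return parser.card


if __name__ == "__main__":
    for name, value in parse_hcard(HCARD_HTML).items():
        print(f"{name}: {value}")
```

The same page still reads normally to a human visitor. The class names simply tell a machine which string is the organisation name, which is the street address and which is the phone number.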

Another new Google feature sits between the lower-case and upper-case forms of the semantic web. Google Squared (link is to a YouTube video) lets you set up a grid for comparative search results. For example, a search for Hotels might return different hotels down the page, with locations, prices and ratings across the page. In typical Google fashion, all this data is mined from the human web, but there’s an obvious attempt to impose machine-readable structure on the content.

Wolfram|Alpha has been compared to Google, but it is really a radically different approach, with a very different end in mind. Wolfram call it a “computational knowledge engine”. It works with strongly structured data: it’s designed to give answers, not a list of results that a human then needs to mine. Some example searches are:

And then there are the classic unanswerable questions (via Amnesia blog):