The Semantic Web isn’t Pedantic Enough
Making sense of the vast amounts of information available on the World Wide Web is no mean feat, and the likes of Google, Yahoo and now Bing do their best, but let’s be honest – SERPs are so 2008.
The aim of the semantic web is for machines to understand the relationships between bits of data clearly enough that when a user searches for something online, the results returned are absolutely spot on. Some surfers think we’re nearly there; however, my own browsing experiences lead me to believe that we’re still far from it.
After a recent trip to Colombia I started reading books with a desert island/shipwrecked theme: The Beach, The Coral Island, The Story of the Shipwrecked Sailor etc. When I searched for titles in this genre, neither Wikipedia nor Amazon had a specific category dedicated to it. Amazon has its “Customers who bought this item also bought” feature; however, for my particular search this was flawed. The suggestions were titles not on a desert island theme but from the same publishing house. Similarly, Wikipedia did not have a shipwrecked category, which surprised me as it aggregates titles under a whole range of categories, from London literature to Transgressive fiction.
Would it be difficult to develop a commercial service which mines the contents of a book looking for relevant keywords such as beach, shipwreck and pirates? This would require having the entire book uploaded to a site such as Amazon, which is no simple task, especially with copyright restrictions. But Amazon already holds the digital text of many titles because of the Kindle, and Google wants to digitize every book ever, so an app surely is not beyond the realms of possibility?
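To make the idea a bit more concrete, here is a rough sketch in Python of what such keyword mining might look like. Everything in it is an assumption for illustration: the `THEME_KEYWORDS` list, the `theme_scores` and `tag_book` helpers and the 1% threshold are all made up, and this is plain word counting rather than anything Amazon or Google actually does.

```python
import re
from collections import Counter

# Hypothetical theme vocabulary, invented for this example.
# A real service would need a far richer, curated list.
THEME_KEYWORDS = {
    "desert island": {
        "beach", "island", "shipwreck", "shipwrecked",
        "castaway", "pirates", "lagoon",
    },
}


def theme_scores(text, themes=THEME_KEYWORDS):
    """Return, for each theme, the fraction of words in `text` that are theme keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {
        theme: sum(counts[word] for word in keywords) / total
        for theme, keywords in themes.items()
    }


def tag_book(text, threshold=0.01):
    """List the themes whose keyword share reaches `threshold` (an arbitrary cut-off)."""
    return [theme for theme, score in theme_scores(text).items() if score >= threshold]


if __name__ == "__main__":
    sample = "The shipwrecked sailor crawled up the beach of the island."
    print(theme_scores(sample))
    print(tag_book(sample))
```

Counting words like this is exactly the kind of context-free matching that falls short of real semantics: a book that merely mentions a beach holiday would score the same as one set on a desert island, so treat it as a starting point rather than a solution.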
Similarly, during some keyword research I was checking out terms around textiles and wanted suggestions similar to “textile”, such as fabric, material and upholstery. But the variations on the original word didn’t deviate in the slightest; they were limited even on the broad match setting. Google’s keyword tool simply generated types of textile: woven, silk, cotton etc. This was disappointing, and I was genuinely surprised at the lack of variation and the tool’s failure to understand what I was looking for.
Semantics is an interesting area of language and on the web it’s becoming more and more important. But people’s understanding of words can differ greatly from individual to individual, and that difference is an essential part of identity and of being human. So can computers ever hope to have the same level of inference? In a robots-rule-the-world kinda way, I hope not.
In its most basic form the semantic web should coherently categorise the data the internet has to offer, but currently it fails to understand the simplest terms and make the easy connections between them. It would seem that Skynet is a long way off yet.

July 16th, 2010 at 4:10 pm
Interesting article – the semantic web is a step in the right direction and you are right in saying we have a long way to go!
The IEML (Information Economy Meta Language) is one project which is looking into just these sorts of issues.
The trouble with searching for keywords is that they aren’t in context, so the IEML provides a markup language (http://www.ieml.org/english/elements.html) in order to provide the machine with more information about what the text actually means.
So the idea is you could ask the net “what films are about desert islands?”, and questions like “what desert island film did most people like, and what did they most like about that film?”.
July 16th, 2010 at 4:23 pm
Brilliant, thanks for bringing the IEML to my attention Steve, that’s pretty much exactly what I’m talking about. Going to check it out now, nice one!