ifacethoughts

Clustering For Context

My interest in search engines has led me to some reading on clustering. Various experiments are under way, and some have even been turned into products. Tara Calishain does a roundup of the options available, and Alex Iskold gives an overview of the clustering mechanism and its advantages.

It is all really interesting, but I would like to consider a different perspective. By default, the popular search engines return flat results. Flat means there is no other dimension to them: they are simply the results of the phrase search. If the search engine could identify the context in which I made the search, the results would be much more accurate. Instead of trying to guess the context, clustering presents you with the multiple contexts possible. It is as if you search for a phrase and are then presented with further phrases by which your results are indexed.
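To make the idea concrete, here is a minimal sketch of turning flat results into labelled groups. It is not how any particular search engine does it: the stopword list, the `cluster_by_label` helper, the "jaguar" query and the result titles are all made up for illustration, and the grouping rule (pick the most widely shared non-query word in each title) is a deliberately naive stand-in for a real clustering algorithm.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, kept tiny for the illustration.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "and", "for", "to", "about"}


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cluster_by_label(query, results):
    """Group flat results under the most widely shared non-query term."""
    ignore = STOPWORDS | set(tokens(query))
    # Candidate label terms for each result, with duplicates removed.
    terms_per_result = [set(tokens(r)) - ignore for r in results]
    # Document frequency: in how many results does each term appear?
    df = Counter(t for terms in terms_per_result for t in terms)
    clusters = defaultdict(list)
    for result, terms in zip(results, terms_per_result):
        if terms:
            # Prefer the most widely shared term; break ties alphabetically.
            label = min(terms, key=lambda t: (-df[t], t))
        else:
            # Nothing but query words and stopwords: no context to offer.
            label = "other"
        clusters[label].append(result)
    return dict(clusters)


# Made-up flat results for a made-up ambiguous query.
results = [
    "Jaguar car dealers",
    "Jaguar car reviews",
    "Jaguar cat habitat",
    "Jaguar cat diet",
    "Jaguar",
]
clusters = cluster_by_label("jaguar", results)
for label, items in clusters.items():
    print(label, "->", items)
```

The labels ("car", "cat") play the role of the second phrase: the engine does not guess which one I meant, it offers both and lets me pick.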

For the technically inclined, check out this tutorial on clustering. I am not sure if clustering is the best way to add context. The contexts can be wrong if the phrase is not understood properly. One of the phrases I regularly use to test is "book on google" or "book about google". It usually returns more results from Google Book Search than results about books on Google. I know natural language search is not very popular, but could considering the phrase as a whole, rather than just the words and their order, together with clustering, increase accuracy? Clustering is a popular way of organizing objects into groups, but could something that looks at the semantics help? Just a thought.

Discussion

  1. Homonyms, Nomenclature And Aspects on iface thoughts said:

    [...] Martin Fowler illustrates the effect of natural language homonyms in modeling and design. A book can mean either the literary work or its physical body. What we mean by book is usually communicated by the context of the conversation. One realisation I have had in my software development experience is that it is important to capture this context, and at the same time it is one of the most difficult things to do. Search engines have used clustering to solve a similar problem. [...]



This is the weblog of Abhijit Nadgouda where he writes down his thoughts on software development and related topics.
