Information Retrieval Jujitsu

On my drive to work this morning, as I mentally began preparing for all the research I wanted to accomplish today, I started thinking about the relationship between information retrieval, machine learning, probability, and statistics.  And I found myself wondering how most of us think about machine learning when we use it as a tool to help people find information.  Specifically, do we use machine learning to help us discover the repeating patterns, the most likely outcomes, and then serve those sorts of results to information seekers?  Or do we use machine learning as a way of helping us understand what the most common outcomes are, so that we can develop information seeking systems that allow users to sidestep these common pathways, finding interesting nuggets that are otherwise obscured by the raw statistics?

Expressed in a slightly different manner: Is machine learning (applied to information retrieval) like karate in that it attacks the information organization/information seeking problem head on, fighting probability with probability?  Or is it more like jujitsu, in that it uses probability’s weaknesses against itself, to come up with retrieval algorithms and solutions that satisfy a user’s information need, despite the tendency to follow the statistically most-obvious path?  Or is there a little of both?

I realize my question is a bit vague and underspecified.  It is just something I am pondering at the moment.

This entry was posted in Information Retrieval Foundations.

2 Responses to Information Retrieval Jujitsu

  1. Good question.

    I think it has to be a bit of both worlds.

    I’ve been thinking recently about the possibility of a recommender deliberately programmed to ignore your biases. For example, if you always read about politics, perhaps it would give you something about science. I guess the same idea could be applied to search.

    What if you mixed up the kinds of results you gave, e.g., sometimes boosting sites like Wikipedia and other times delivering more academic content? That is, the system would help you overcome biases you didn’t know you had.

    You might want to give the user a choice in how this functionality works, but it would be an interesting feature.


  2. jeremy says:

    James — good point about ignoring biases.

    I did my grad work in music information retrieval. One of my committee members, Don Byrd, often made an interesting point along these lines. He used to say, many years ago, that he did not want a music recommender that gave him the most similar songs or artists based on his likes and listening habits. He wanted a recommender that introduced music as different as possible from everything already in his collection. An anti-recommender, or a “show me something I wouldn’t have found any other way” recommender. I always liked that idea. I suppose it’s rather similar to an “overcome my bias” personalized search system (or would that be anti-personalized?)
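
The anti-recommender idea from the comments can be made concrete with a small sketch. This is only one possible reading of it, not anything Don Byrd or the commenters specified: it assumes each song is described by a numeric feature vector, and it ranks candidates by how far they sit from the *closest* item already in the collection. The function name `anti_recommend`, the three made-up features, and all the song names and values are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def anti_recommend(
    collection: Dict[str, Sequence[float]],
    candidates: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[str]:
    """Return the k candidates least similar to anything in the collection.

    A candidate's "closeness" is its highest cosine similarity to any owned
    item, so a candidate only ranks well if it is unlike *everything* owned.
    """

    def closeness(name: str) -> float:
        return max(
            (cosine(candidates[name], owned) for owned in collection.values()),
            default=0.0,  # an empty collection makes every candidate equally novel
        )

    # Ascending sort: the least similar candidates come first.
    return sorted(candidates, key=closeness)[:k]


# Hypothetical features: [tempo, acousticness, distortion], each in [0, 1].
collection = {
    "folk_ballad": [0.2, 0.9, 0.1],
    "acoustic_blues": [0.3, 0.8, 0.2],
}
candidates = {
    "another_folk_song": [0.25, 0.85, 0.1],
    "thrash_metal": [0.9, 0.1, 0.95],
    "ambient_drone": [0.05, 0.4, 0.3],
}

print(anti_recommend(collection, candidates, k=2))
```

Flipping the sort order turns this back into an ordinary "more like this" recommender, which is the jujitsu point in miniature: the same similarity statistics are computed either way, and the only change is whether the system follows them or steps around them.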
