On my drive to work this morning, as I mentally began preparing for all the research I wanted to accomplish today, I started thinking about the relationship between information retrieval, machine learning, probability, and statistics. And I found myself wondering how most of us think about machine learning when we use it as a tool to help people find information. Specifically, do we use machine learning to help us discover the repeating patterns, the most likely outcomes, and then serve those sorts of results to information seekers? Or do we use machine learning as a way of helping us understand what the most common outcomes are, so that we can develop information-seeking systems that allow users to sidestep these common pathways and find interesting nuggets that are otherwise obscured by the raw statistics?
Expressed in a slightly different manner: Is machine learning (applied to information retrieval) like karate, in that it attacks the information organization/information seeking problem head-on, fighting probability with probability? Or is it more like jujitsu, in that it uses probability’s weaknesses against itself to come up with retrieval algorithms and solutions that satisfy a user’s information need, despite the tendency to follow the statistically most obvious path? Or is there a little of both?
I realize my question is a bit vague and underspecified. It is just something I am pondering at the moment.