Comments on: Machine Learning and Search: Action or Reaction?

By: Information Retrieval Gupf » Speed Matters. So Does the Metric.

Information Retrieval Gupf » Speed Matters. So Does the Metric. — Mon, 29 Jun 2009 23:58:54 +0000

[…] Machine Learning and Search: Action or Reaction? […]

By: marianasoffer

marianasoffer — Thu, 28 May 2009 22:16:55 +0000

They also failed to keep up to date with new trends, such as the stream, which created the need for real-time search. I think it is impossible for search engines to somehow infer things like this, which fall outside the scheme they function in.

By: jeremy

jeremy — Thu, 28 May 2009 05:24:26 +0000

Yeah, let me clarify: for the majority of search engine scenarios, for the average of all users, the standard log analysis and A/B testing is just fine. There is no reason why it shouldn’t be the preferred method, in fact, given the scale of the web. I never intended to imply anything but it being the way to go, a decent percentage of the time.

The trouble starts when you get to the long tail of information needs, the variety of users that want to use the information on the web for something other than looking up a home page. In those learning-oriented, knowledge-skimming- and synthesis-oriented, analysis-oriented types of search tasks, simply mining the logs may not give you the information you seek.

For example, one person (let’s say “User A”) might run 6 queries in a row, constantly clicking on the back button and trying another one, because they can’t find the “correct” query to get them to the known-item piece of information that they are seeking. That would be a failed search, using existing tools. However, another person (let’s say “User B”) might run 6 queries in a row, quickly skimming the summaries for the top few results for each query and then constantly clicking on the back button and trying another query, because User B is attempting to probe the conceptual boundaries of a topic area. That would be a successful search, but one in which the tool was used quite awkwardly, i.e. the search engine really didn’t provide the right tool, so the user had to simulate a boundary-probing search tool, manually. Like using the prongs of a hammer to grab hold of and screw in a screw: you can do it, but it ain’t pretty.

However, from the perspective of the log analysis, both users A and B look exactly the same. Both try 6 different queries, in rapid succession, and don’t click anything after running each query. But user A considered his/her search a failure, and user B considered his/her search a success.

So I can’t say for sure, because I am not at one of these companies. But it seems to me that it would be very difficult to tell these two types of users apart, to know for sure from the log analysis whether both users were of type A, both of type B, or one of each. And even if you did user interviews, and discovered that user B exists, you still have to rely on log mining to know exactly how many people of type user B there are. And in that case you’re back to the old problem of knowing how to tell user A and user B apart in the logs, when their behaviors look exactly the same.
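To make the User A / User B point concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Session` record, the event tuples, the example queries, and the three features in `log_features` are hypothetical, not any real search engine's log schema. It only shows that when two sessions produce the same logged events, any feature computed from those events comes out identical, even though the users' own verdicts differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Session:
    """Hypothetical session record; not a real search engine's log format."""
    user: str
    # Each event: (seconds since session start, event type, payload)
    events: List[Tuple[int, str, str]]
    # Known only from an interview; this never appears in the log.
    self_reported_success: bool


def log_features(session: Session) -> dict:
    """Compute features from logged events only (assumed feature set)."""
    queries = [e for e in session.events if e[1] == "query"]
    clicks = [e for e in session.events if e[1] == "click"]
    times = [t for t, _, _ in queries]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "num_queries": len(queries),
        "num_clicks": len(clicks),
        "mean_gap_seconds": sum(gaps) / len(gaps) if gaps else 0.0,
    }


def make_session(user: str, queries: List[str], success: bool) -> Session:
    """Build a session of rapid-fire queries, 20 seconds apart, no clicks."""
    events = [(i * 20, "query", q) for i, q in enumerate(queries)]
    return Session(user, events, success)


# User A: hunting for one known item and failing to find the right query.
user_a = make_session(
    "A",
    ["oxo cup", "oxo measuring cup", "angled measuring cup",
     "oxo angled cup video", "oxo president talk", "oxo design talk"],
    success=False,
)

# User B: deliberately probing the boundaries of a topic, and satisfied.
user_b = make_session(
    "B",
    ["prague cafes", "prague coffee houses", "prague old town cafes",
     "prague literary cafes", "prague cafes locals", "prague hidden cafes"],
    success=True,
)

features_a = log_features(user_a)
features_b = log_features(user_b)

print(features_a == features_b)  # the logs cannot tell them apart
print(user_a.self_reported_success == user_b.self_reported_success)
```

A real log would carry more signals (dwell time, query reformulation patterns, scrolling), and some of those might separate the two users in practice. The sketch only illustrates the case described above, where the observable behavior really is the same.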

By: Information Retrieval Gupf » Wired Article on Bing

Information Retrieval Gupf » Wired Article on Bing — Wed, 27 May 2009 19:08:20 +0000

[…] which will supposedly be named Bing. It touches on some of the issues that we were discussing in yesterday’s comment thread, in particular: People thought online e-mail was just fine and more or less converged on the same […]

By: Jon

Jon — Wed, 27 May 2009 16:46:20 +0000

There’s certainly a huge utility to A/B testing and log analysis. A/B testing can very effectively support incremental system changes (A and B need to be comparable), and log analysis is essential for understanding how people use the existing system & possibly influencing what the incremental changes might be.

But, I agree with you that they’re not sufficient for understanding process failures. If you want to understand these with automatic machine learning methods, I’d guess that a much broader view of a “log” is needed. For example, a capture of all the actions associated with an information seeking task, not just in the search engine but in the word processor, on your blog, on twitter, in your code editor, on your phone, and even personal interactions away from the computer. It’s hard to imagine collecting this data at all, much less on a reasonably large scale.

By: jeremy

jeremy — Wed, 27 May 2009 16:11:11 +0000

I would recommend watching Jon’s video, above, starting at about 16:30, the example with the measuring cup. No user could articulate that a problem with their current measuring process existed, until they were provided with a better solution.

By: jeremy

jeremy — Wed, 27 May 2009 15:57:03 +0000

Yes, exactly, Jon. Failure in the process, rather than failures within the existing system. Everybody raves about A/B testing and log analysis, but it seems to me that even the people doing the log analysis are only looking for failures within the existing system, and if a searcher is trying to do something outside of the constraints of the system, no amount of machine learning, user modeling, or log analysis will detect that.

Everyone talks about how “user driven” they are, and so there must be some sort of process for understanding and measuring situations wherein the process itself is not working. That’s what I’ve been seeking to understand for many years now.

So is your Oxo example the answer? There really is no process for detecting “out of band” processes, and it is up to the search engine designers to come up with that stroke of creativity?

By: Jon

Jon — Wed, 27 May 2009 15:43:02 +0000

oops — the video is here:
http://vimeo.com/3200945

By: Jon

Jon — Wed, 27 May 2009 15:42:06 +0000

Looking at your feedback loop in the post, it seems like the typical target of refinement would be failures within the system itself. But, it sounds like you want to be looking at failures of the *process* that the system supports (or forces users into).

There’s an art to identifying those types of failures — as you’ve said, users may not know how to articulate what the failure may be, or may not even perceive it as a failure since they’re using the provided tool as it was intended to be used and as it is used by everyone else. Most people probably don’t even think there might be a better way, they just work around the provided constraints.

This is the case with all design, not just information seeking systems. Check out this video of the president of Oxo talking about their design process, especially the bit about their measuring cup. No one perceived the tool as lacking until the whole process was evaluated. Even then, it took a real stroke of creativity to break out of the established process.

By: jeremy

jeremy — Wed, 27 May 2009 14:33:59 +0000

Sorry, Fernando, that was a flip response. What I mean is, the user can’t always articulate what it is that the search engine is *not* doing, to help them find the information that they need. Take my ongoing “prague cafes” example (http://irgupf.com/2009/04/23/retrievability-and-prague-cafes/). If I were an average user, I might write something like “I don’t know if I’ve found everything that I need”.

But it’s quite a leap to go from that user statement, to the notion that some sort of exploratory interface and supporting algorithm is needed, never mind even what interface and algorithm would be best for satisfying this type of “off the beaten path” cafe search.

But yeah, as Daniel says, it could at least be a starting point to get engineers to think about new problems.