Daniel T. has an interesting bipartite use-case model for exploratory search:
- I know what I want, but I don’t know how to describe it.
- I don’t know what I want, but I hope to figure it out once I see what’s out there.
Perhaps this is a silly analogy, but framing the problem in this way reminded me abstractly of P vs. NP. Some problems can be both computed and verified in polynomial (P) time. Other problems can be verified in P time, but it is unknown whether a P-time solution to the problem exists. These are the non-deterministic polynomial set of problems (NP). In the worst case, it might take exponential time to get the answer.
Google and related web search engines are lookup, navigational, known item engines. You can both obtain and verify your answer in polynomial time. Linear even (hence the classic ranked list).
With an exploratory information need, the satisfaction of your information need can be verified in polynomial time. It doesn’t take too long to examine the assembled set of summarized/contrasted/accumulated information and tell whether or not your information need has been satisfied. Maybe it doesn’t take constant time, but it certainly can be accomplished in linear time. But accumulating that information in the first place? It is generally unknown how long that will take, as there is a bit of non-determinism in the information seeking pathways that you need to traverse. Exploratory search is, dare I say, NP.
So the big question: Is P = NP. That is to say, can one use a tool such as Yahoo!, Google, etc. which has been generally optimized for lookup, P-time problems and use it to satisfy one’s exploratory information seeking task? Certainly one can try and use these tools in this manner. Nothing stops a user from entering vast quantities of queries and accumulating the necessary set of information themselves. But the tool has not been designed for that purpose. So can it really be used to solve that problem? Are multiple iterations of lookup search capable of satisfying an exploratory information need? Does P = NP?
I don’t think that it does.
Question for the day: For certain classes of NP problems (e.g. the knapsack problem) there are often heuristic that yield good approximations (nearly-optimal) solutions in P-time. What are the analogous classes of problems in the exploratory information seeking domain? And how would we, in general, recognize them?