Lookup is to Exploratory Search as P is to NP

Daniel T. has an interesting bipartite use-case model for exploratory search:

I know what I want, but I don’t know how to describe it.
I don’t know what I want, but I hope to figure it out once I see what’s out there.

Perhaps this is a silly analogy, but framing the problem in [...]

The Tyranny of Simplicity, Redux

One of my ongoing research interests is retrieval interfaces that allow more expressive and powerful statements of a user’s information need.  In that spirit, I wrote a minor rant last April about how the Apple iTunes smart playlist creation interface sacrifices functionality in the interest of simplicity.  One could only create smart [...]

More Information Is Positive

Via Greg Linden, I came across this interesting quote from Eric Schmidt about the obligation to help newspapers succeed:

Finally, Eric claimed Google has a moral duty to help newspapers succeed:

Google sees itself as trying to make the world a better place. And our values are that more information is positive — [...]

The Craft of Storytelling

I’ve been playing around with some old TREC data over the past few days and, completely by chance, I came across this document.  I find it interesting because storytelling is a good metaphor for what we as researchers do when we construct interactive information seeking systems.  The document is short enough that I think [...]

Good Interaction Design II: Just Ask

Last March I pointed out a short piece by Tessa Lau about how good interaction design trumps smart algorithms.  Today I have a follow-up.  In particular, Xavier Amatriain has a good write-up of the recently concluded Netflix contest.  Some of the lessons learned by going through the process concern the importance of good evaluation metrics, the effect of (lapsed) time, matrix factorization, algorithm combination, and the value of data.

Data is always important, but what struck me in the write-up was his discovery that the biggest advances came not from accumulating massive amounts of data, log files, clicks, etc.  Rather, while dozens and dozens of researchers around the world were struggling to reach that coveted 10% improvement by eking out every last drop of value from large data-only methods, Amatriain comparatively easily blew past that ceiling and hit 14%.

How?  Continue reading…

Tomorrow’s Data

Jeff Dalton recently wrote about why he doesn’t want your search log data.  It is an interesting read, and I recommend going through the whole article and comments.  But I want to call attention to one thought in particular:

Academia should be building solutions for tomorrow’s data, not yesterday’s. What will the queries and [...]