Jeff Dalton recently wrote about why he doesn’t want your search log data. It is an interesting read, and I recommend going through the whole article and comments. But I want to call attention to one thought in particular:
Academia should be building solutions for tomorrow’s data, not yesterday’s. What will the queries and documents look like in 5 or even 10 years and how can we improve retrieval for those? It’s not an easy question to answer, but you can watch Bruce Croft’s CIKM keynote for some ideas…I still believe in empirical research. However, I’m also well-aware that over-reliance on limited data can lead to overfitting and incremental changes instead of ground-breaking research. To use an analogy from Wall Street, we become too focused on quarterly paper deadlines and lose sight of the fundamental science.
It is a provocative thought, and I find it compelling. By spending too much effort paying attention to yesterday’s data, and even today’s, you wind up limiting yourself to the existing, visible gradient. At the same time, an open question is how one develops for tomorrow’s data when that data by definition does not yet exist. This is a question that I hope to address more in the coming months. Not answer, but address. Most likely by pointing to work by other researchers not directly working on the IR task (as I’ve done a bit in the past). Developing for tomorrow’s data is not an easy task, but it should not be dismissed just because it lies far beyond the needs of today’s users.
There’s no doubt that the information economy continues to create a lot of wealth, but I think it’s fair to ask if it’s also creating enough science to replenish the stock of scientific capital that it’s still burning through. I think it’s clear that chaotic, market-driven change is a good way to bring ideas quickly and efficiently from concept to profitable product. However, such a rapid churning of the institutional and cultural landscape ultimately may not be conducive to the kind of steady, expensive, long-term investment in fundamental research that produces the really big ideas that somewhere, at some completely unforeseeable point in the future, change the world.