More and Faster versus Smarter and More Effective

Last month, in reaction to the “Unreasonable Effectiveness of Data” paper that made the rounds, Stephen Few from the Business Intelligence community wrote an interesting post:
The notion that “we need more data” seems to have always served as a fundamental assumption and driver of the data warehousing and business intelligence industries. It is true that [...]

The Tyranny of Simplicity

One of my ongoing frustrations with modern, consumer-facing information organization and retrieval systems is the way in which functionality is often sacrificed in the name of simplicity.
Full functionality under the rubric of simplicity is a laudable goal, and I would agree that this is where we all eventually want to end up in the information [...]

World Pinhole Photography Day

While the focus of this blog is the retrieval of existing information, from music to images to videos to text, every once in a while it is nice to create new information as well.  In that spirit I decided to participate in World Pinhole Photography Day, which is today, Sunday April 26, 2009.  While I [...]

Retrievability and Prague Cafes

A week or two ago I began writing a few thoughts about large-data-based algorithms and retrievability.  They were spawned by the Unreasonable Effectiveness of Data position paper by a couple of notable Googlers, which then led to a brief discussion.
My main contention was that by relying too heavily on algorithms that are based solely [...]

Google Similar Images: Only 20%?!

A few days ago, Google launched “similar image search” functionality.  From TechCrunch:
A new 20% time Google project has just launched called Google Similar Images. It’s pretty self-explanatory — when you search for an image and find one close to what you’re looking for, Google can now find ones that it believes to be the same, [...]

Dagstuhl Seminar on Content-Based Retrieval

As a researcher, it is occasionally quite interesting to reread thoughts and positions that I’ve taken in years and works past. Sometimes I can observe a marked shift from my previous thinking; avenues or approaches that I once considered fruitful I now no longer do. And sometimes I can observe hints and seeds of my [...]

“Improving Findability” Falls Short of the Mark

Via Tim O’Reilly on Twitter, I came across this article by Vanessa Fox on how governments can improve the findability of their web pages, and thereby allow citizens to become better informed and government to be more transparent.  Fox writes:

Universal, Google launch ‘Vevo’ Music Service

From Wired:
Vevo will launch later this year, a collaboration between Universal Music Group and Google the partners expect to be the leading music video service in the world from day one. Google confirmed to Wired.com Thursday that all of Universal Music Group’s video assets (music videos, interviews, concert footage and possibly Kyte-style backstage video) [...]

Retrievability

In my previous post I talked a little about the notion that big data alone cannot solve many of our problems.  I would like to give a more concrete example of this by discussing a paper published at CIKM 2008: “Retrievability: An Evaluation Measure for Higher Order Information Access Tasks” by Azzopardi and Vinay.  In large [...]

Large Data versus Limited Applicability

Large data can be extremely effective, but how widely applicable is it, really?
A week or two ago the blogosphere was abuzz with discussion about the Unreasonable Effectiveness of Data position paper by Googlers A. Halevy, P. Norvig, and F. Pereira.  I had my own commentary, but some great discussion came when Peter Norvig jumped in to [...]