Breadth Destroys Depth

A few days ago I posted a question about why modern web retrieval systems offer no explicit relevance feedback mechanisms.  I wonder if it has anything to do with the following attitude, explained by one of my favorite bloggers, Nick Carr:

The problem with the Web, as I see it, is that it imposes, with its imperialistic iron fist, the “ecstatic surfing” behavior on everything and to the exclusion of other modes of experience (not just for how we listen to music, but for how we interact with all media once they’ve been digitized). In the pre-Web world, we not only enjoyed the thrill of the overnight sensation – the 45 that became the center of your waking hours for a week only to be replaced by the new song – but also the deeper thrill of the favorite band in whose work we deeply immersed ourselves, often following its progression over many records and many years. Continue reading…

Semantic Technology Search Panel

On Wednesday I attended the Executive Round Table on Semantic Search, at the 2009 Semantic Technology Conference.  Researchers from Ask, Hakia, Yahoo, Google, Powerset/Bing, and True Knowledge were on the panel.  In the next few days I hope to give a longer write-up of the session over on the FXPAL blog.  In the meantime I wanted to quickly point out one nugget, and one related Tweet.

The panel covered a large number of topics.  But it was inevitable that the moderator would turn to the Google panelist (Peter Norvig) and ask him what he thought about Bing.  There has been too much buzz lately for that question not to be asked.  I was pleasantly surprised by his answer.  I'm not going to risk quoting him directly, only paraphrasing; if I misrepresent anything, the mistakes are mine and unintentional.

[Paraphrase] Norvig’s first answer to the Bing question was to say that he likes the idea of innovation in the user interface.  He thinks that there is a lot of room for more such innovation, and for a lot of different reasons.  Historically, there has been too much emphasis on getting the ranking right, at the expense of all else.  Of course (he added) a quality ranking is something that you absolutely must have.  But for too long it has been the only thing that has been worked on, and that needs to change.  He thinks Bing has made some good steps, and that there are a lot more that can be made as well.

Wow!  This is not the Google that I’ve known for a decade, the Google that has actively shunned most forms of interactivity, feedback, and exploration other than spelling correction.  Continue reading…

Exploratory Food Search

I came across an interesting article today in the New Scientist on the topic of mass-scale food annotation.  The idea is that we can instrument our food, so that we know much more about its origin and manner of production:

WHERE does your food come from? A few years ago, most consumers were satisfied with a sticker showing the country of origin. But concerns about fair trade and the environment, as well as food safety, are now driving a wave of projects aimed at tracking food from farm to shopping basket. Though price is still the main factor determining the food that people buy, many are demanding to know more about its source. This is partly due to a series of recent food safety scandals, from major outbreaks of salmonella and E. coli to melamine showing up in baby formula and pet food. “The public want to know where their food and other products come from, how they are made, and whether they contain any ‘unhealthful’ contaminants,” says Dara O’Rourke, an environmental policy expert at the University of California, Berkeley. Ethical and environmental concerns figure prominently, too. In the US, for example, “a small but rapidly growing percentage of the population – perhaps 8 to 10 per cent – are deeply interested in these issues,” says food policy expert Marion Nestle of New York University. “Interest in where food comes from is part of a growing social movement.”

Most manufacturers already use barcodes or RFID chips to track their products. But with the help of cheap cellphone and internet access it is becoming possible to collate data from remote locations around the world and make it available to the people who are actually going to eat the food. In many cases manufacturers are alive to the notion that transparency about the source of their food is good for business. Sime Darby, a large palm oil supplier in Indonesia and Malaysia, is working with FoodReg, a firm based in Barcelona, Spain, that develops food-tracking software. The idea is to develop a system to prove to customers that its crops are not grown on land recently occupied by tropical rainforest. In remote regions where farmers don’t have access to computers, they can use cellphones to record onto FoodReg’s online database the time and place the crop was harvested. Tracking systems like this should also make it easy to calculate the distance that goods travel to reach stores, allowing consumers to estimate the greenhouse gas emissions racked up by the transport of their food. “The calculation of food miles and carbon footprint could be the killer application for traceability,” says Heiner Lehr of FoodReg. “The technology is there. If a big retailer puts itself behind this, it could happen very fast.”

Projects like this are interesting to me because I can imagine myself in the future making decisions about how and what I buy, based on the information that I am able to obtain about my various choices.  In fact, it would be nice to be able to walk into the grocery store with the information-seeking intent of finding a good source of protein (whether chicken or beef, or maybe even just beans) for the evening’s meal, and come out of the store with a product that not only fit my budget, but that I felt good about buying.  But in order to make this information useful to consumers, there has to be some sort of search or information retrieval layer built on top of the data.
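To make the idea of such a retrieval layer concrete, here is a minimal sketch of what querying annotated food data might look like: filter products by nutritional role and budget, then rank by estimated food miles.  All product records and field names here are hypothetical illustrations, not any real traceability schema.

```python
# A toy retrieval layer over provenance-annotated products: filter by
# budget and protein content, then rank by food miles (lower is better).
# All records and field names are invented for illustration.

products = [
    {"name": "chicken breast", "protein_g": 31, "price": 6.99,
     "origin": "local farm", "food_miles": 40},
    {"name": "black beans", "protein_g": 21, "price": 1.49,
     "origin": "imported", "food_miles": 2200},
    {"name": "ground beef", "protein_g": 26, "price": 5.49,
     "origin": "regional", "food_miles": 300},
]

def find_protein_sources(products, max_price, min_protein_g):
    """Return qualifying products, lowest food miles first."""
    matches = [p for p in products
               if p["price"] <= max_price and p["protein_g"] >= min_protein_g]
    return sorted(matches, key=lambda p: p["food_miles"])

for p in find_protein_sources(products, max_price=6.00, min_protein_g=20):
    print(p["name"], p["food_miles"])
# ground beef 300
# black beans 2200
```

Even this crude sketch shows why raw annotations alone are not enough: the consumer's intent ("a good protein source I feel good about") only becomes actionable once some retrieval layer turns the data into ranked, comparable choices.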

This is where the “I’m feeling lucky” model of simply trying to give the consumer an “answer” breaks down.  Continue reading…

Compare Google Yahoo Bing

I would like to point to a post worth reading, over at Blogoscoped, about personal, blind side-by-side comparisons of the various contending search engines.  I have seen studies like this for years, both on the web and in published, academic papers (see my earlier post).  And this current, informal study continues to confirm what all the other studies have shown: When you strip away branding information, there is no clear winner from among the top-contending search engines.  Maybe years ago, Google was leaps and bounds better than all the others.  Today, that no longer appears to be the case.
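The blinding protocol behind these studies is simple enough to sketch: strip all branding, randomize which engine's results appear on which side, and record the judge's preference against the hidden assignment.  The engine names and result strings below are placeholders, not data from the Blogoscoped study.

```python
# A minimal sketch of a blind side-by-side comparison trial.
# Engine names and results are placeholders for illustration.

import random

def blind_trial(results_by_engine, rng):
    """Randomly assign engines to sides; return the hidden assignment
    and the anonymized columns the judge actually sees."""
    engines = list(results_by_engine)
    rng.shuffle(engines)               # hide which engine is on which side
    left, right = engines[0], engines[1]
    columns = {
        "Left": results_by_engine[left],    # titles only, no logos/branding
        "Right": results_by_engine[right],
    }
    return (left, right), columns

rng = random.Random(42)
assignment, columns = blind_trial(
    {"EngineA": ["result a1", "result a2"],
     "EngineB": ["result b1", "result b2"]}, rng)

# The judge sees only columns["Left"] and columns["Right"]; after they
# pick a side, the hidden assignment reveals which engine won the trial.
```

The essential point is that the judge never sees the assignment until after voting, which is exactly what removes the brand effect that ordinary usage cannot escape.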

The reason I point out this informal study is not only to continue to raise awareness of the essential parity among the engines, but to point out something interesting that the author of the post (Philipp Lenssen) says: Continue reading…

Wired Article on Bing

I just came across a Wired article today on a new search push from Microsoft, which will supposedly be named Bing. It touches on some of the issues that we were discussing in yesterday’s comment thread, in particular: 

People thought online e-mail was just fine and more or less converged on the same specific set of features — until Google came along and gave people gigs of disk space, organized e-mails by conversations and let people send big attachments. Soon Yahoo and Microsoft were forced to follow. So too with search. Google appears to have created the staple recipe, but there is a clear hunger for something more. Unfortunately people may not know what that something extra is until they see it — and that’s something not even Google has been able to figure out. So what do we know about what web searchers want? Weitz gave Wired.com a look at some of what Microsoft found when it went “back to the data” — namely Live.com search results — in a bid to make a qualitative leap in search performance. The data shows rampant clicking by many on the back button, while others get desperate enough to look to the second page of results. And when that doesn’t work, the users try again, coming up with slightly different terms. That’s about half of the searches. Only a quarter of searches return a good result — meaning an answer to a question (think a stock price), a satisfying search engine result or a happy ad click.

While this is a good start, it’s still not clear to me that the interpretations of the measurements are correct.  Just because someone doesn’t click something, does that mean the search was a failure?  Just because someone did click something, does it mean that the search was a success?  It is not too difficult to come up with reasonable and abundant counterexamples.  And it’s still not clear how to differentiate task failure from process failure.
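The ambiguity is easy to make concrete.  Here is a toy version of a naive click-based success metric, along with two counterexamples of the kind alluded to above.  The session format is entirely hypothetical, not Microsoft's actual log schema.

```python
# A toy illustration of why click-based success metrics are ambiguous.
# The naive rule labels any session ending in a result click a "success";
# the two counterexamples show how that mislabels real outcomes.
# The session/event format is invented for illustration.

def naive_success(session):
    """Naive metric: success iff the session's last event is a click."""
    return bool(session) and session[-1]["event"] == "click"

# Counterexample 1: the answer appeared in the results page itself
# (think a stock price), so the user never clicked — labeled a failure
# even though the task succeeded.
good_abandonment = [{"event": "query", "q": "GOOG stock price"}]

# Counterexample 2: the user clicked the top result, found the page
# useless, and gave up — labeled a success even though the task failed.
bad_click = [{"event": "query", "q": "hidden cafes in prague"},
             {"event": "click", "rank": 1}]

print(naive_success(good_abandonment))  # False, yet the task succeeded
print(naive_success(bad_click))         # True, yet the task failed
```

Any log-based success measure has to confront both cases, which is precisely the task-failure versus process-failure distinction: the clicks tell you what the process looked like, not whether the task was accomplished.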

On a slightly different note, I found the following excerpt from the article particularly interesting: Continue reading…

Google Search Options and the Paradox of Choice

Google finally acquiesces, and starts exposing more advanced, user-controllable search result refinement options.  See here, here, and here:

But as people get more sophisticated at search they are coming to us to solve more complex problems. To stay on top of this, we have spent a lot of time looking at how we can better understand the wide range of information that’s on the web and quickly connect people to just the nuggets they need at that moment. We want to help our users find more useful information, and do more useful things with it. Our first announcement today is a new set of features that we call Search Options, which are a collection of tools that let you slice and dice your results and generate different views to find what you need faster and easier. Search Options helps solve a problem that can be vexing: what query should I ask? Let’s say you are looking for forum discussions about a specific product, but are most interested in ones that have taken place more recently. That’s not an easy query to formulate, but with Search Options you can search for the product’s name, apply the option to filter out anything but forum sites, and then apply an option to only see results from the past week.
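The workflow in that example — forum discussions about a product, restricted to the past week — is essentially filter composition over a shared result set.  A minimal sketch, with entirely hypothetical result records and field names:

```python
# A sketch of the stackable-refinement workflow Google describes:
# start from keyword matches, then apply "forums only" and "past week"
# as independent filters that each narrow the previous view.
# Result records and field names are invented for illustration.

from datetime import date, timedelta

today = date(2009, 5, 15)
results = [
    {"title": "ProductX review", "site_type": "news",
     "date": date(2009, 5, 14)},
    {"title": "ProductX battery issues?", "site_type": "forum",
     "date": date(2009, 5, 12)},
    {"title": "ProductX mod thread", "site_type": "forum",
     "date": date(2009, 3, 2)},
]

def only_forums(results):
    return [r for r in results if r["site_type"] == "forum"]

def past_week(results, today):
    cutoff = today - timedelta(days=7)
    return [r for r in results if r["date"] >= cutoff]

# The options stack: each one narrows the previous view.
view = past_week(only_forums(results), today)
print([r["title"] for r in view])  # ['ProductX battery issues?']
```

The point of the design is that the user never has to formulate the awkward compound query; each refinement is a separate, reversible choice layered on top of a plain keyword search.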

I’m pleased to see that it is finally happening.  For years I’ve clamored about how frustrating it is that Google not only hasn’t given users these sorts of options, but has actively campaigned against such functionality: They have often said that exposing advanced tools is “too complex” for users and that it would clutter the famously clean Google interface.  Perhaps the long-held belief that simplicity trumps all other considerations is finally being let go, with the understanding that functionality is sometimes more important than bare and minimal interfaces.  This is a good thing.

Willingness to expose these tools helps topple the oft-perpetuated myth that HCIR interfaces offer the user too much choice and therefore do more harm than good.  Continue reading…

Universal Search is not Exploratory Search

In a recent response article, Danny Sullivan takes Forbes CEO Spanfeller to task on the whole Google vs. The Newspapers issue.  There are a lot of things I agree with Danny about, and an equal number of things that I disagree with.  But I feel compelled to propagate one nugget from Spanfeller:

Spanfeller: Search is not really all that great at the moment, a comment repeated time and again by much more astute folks than me. This is especially true when looking for high-quality professionally created content. This is not to say that user-generated content or ecommerce options or product specs should not be returned in search results, simply that there is clearly a better way to showcase the different paths an end user might be pursuing. The idea that everyone is forced into trying to “game” the system so that they get their “fair” (or sometimes not so fair) share is testament to how terribly wrong this entire process has become.

This excites me because I see in this statement an acknowledgment and realization that Exploratory Search and HCIR (“showcasing”) is necessary.  Sullivan, however, completely misses the point: Continue reading…

More and Faster versus Smarter and More Effective

Last month, in reaction to the “Unreasonable Effectiveness of Data” paper that made the rounds, Stephen Few from the Business Intelligence community wrote an interesting post:

The notion that “we need more data” seems to have always served as a fundamental assumption and driver of the data warehousing and business intelligence industries. It is true that a missing piece of information can at times make the difference between a good or bad decision, but there is another truth that we must take more seriously today: most poor decisions are caused by lack of understanding, not lack of data. The way that data warehousing and business intelligence resources are typically allocated fails to reflect this fact. The more and faster emphasis of these efforts must shift to smarter and more effective. Although current efforts to build bigger and faster data repositories and better production reporting systems should continue, they should take a back seat to efforts to increase the data sense-making skills of workers and to improve the tools that support these skills.

This is a point that I wholly subscribe to, and an aspect of which I encountered the other day when attempting to use web search engines to satisfy my “hidden cafes in prague” information need.  Continue reading…

“Improving Findability” Falls Short of the Mark

Via Tim O’Reilly on Twitter, I came across this article by Vanessa Fox on how governments can improve the findability of their web pages, and thereby allow citizens to become better informed and government to be more transparent.  Fox writes:

Continue reading…

Music Explaura: Exploration and Discovery in Action

Music Information Retrieval continues to be an excellent place to play around with the intersection of search, recommendation, user-guided exploration, and explanatory (transparent) algorithms.

First, check out the announcement of Music Explaura from Stephen Green at Sun Research.  Stephen writes:

Continue reading…