Wired Article on Bing

I just came across a Wired article today on a new search push from Microsoft, which will supposedly be named Bing. It touches on some of the issues that we were discussing in yesterday’s comment thread, in particular: 

People thought online e-mail was just fine and more or less converged on the same specific set of features — until Google came along and gave people gigs of disk space, organized e-mails by conversations and let people send big attachments. Soon Yahoo and Microsoft were forced to follow. So too with search. Google appears to have created the staple recipe, but there is a clear hunger for something more. Unfortunately people may not know what that something extra is until they see it — and that’s something not even Google has been able to figure out. So what do we know about what web searchers want? Weitz gave Wired.com a look at some of what Microsoft found when it went “back to the data” — namely Live.com search results — in a bid to make a qualitative leap in search performance. The data shows rampant clicking by many on the back button, while others get desperate enough to look to the second page of results. And when that doesn’t work, the users try again, coming up with slightly different terms. That’s about half of the searches. Only a quarter of searches return a good result — meaning an answer to a question (think a stock price), a satisfying search engine result or a happy ad click.

While this is a good start, it’s still not clear to me that the interpretations of the measurements are correct.  Just because someone doesn’t click something, does that mean the search was a failure?  Just because someone did click something, does it mean that the search was a success?  It is not too difficult to come up with reasonable and abundant counterexamples.  And it’s still not clear how to differentiate task failure from process failure.

On a slightly different note, I found the following excerpt from the article particularly interesting: Continue reading…

Machine Learning and Search: Action or Reaction?

I have a question that has been bothering me, kicking around in my head, for at least half a decade now.  And I can’t seem to come to any solid conclusion on it. I suppose it can’t hurt to throw it out here onto the web, and see if one of my 3 readers [...]

Week Links, Volume 1

This was a particularly busy week, and I did not get a chance to post many thoughts.  Instead, I’ll do a quick roundup of articles that I enjoyed reading this past week+.

First, a tongue-in-cheek post from Nick Carr entitled For Whom the Google Tolls:

It’s amazing that, before Google came along, any of us was able to survive beyond childhood. At the company’s Zeitgeist conference in London yesterday, cofounder Larry Page warned that privacy-protecting restrictions on Google’s ability to store personal data were hindering the company from tracking the spread of diseases and hence increasing the risk of mankind’s extinction. The less data Google is allowed to store, said Page, the “more likely we all are to die.”

Continue reading…

Opposite Day

Two pieces of recent news have my head spinning. Both are instances of technology companies acting in exactly the opposite manner from their ideals (and public statements). The first is Microsoft’s announcement of an open-source version of BigTable:

Instead of creating a proprietary copy of these pieces of infrastructure, Powerset decided instead to turn to Hadoop, [...]

Personal Branding and Search Results Integrity

Google is an information retrieval company that prides itself on the purity of its results.  It does not allow the integrity of its ranked list ordering to be tampered with by sponsored results. It also has claimed for years that it does not engage in hand-coding (aka hand-crafting or hard-coding) of results. Everything that it returns in the non-sponsored, organic list is purely algorithmic, or at least only indirectly influenced by the hand of humans (e.g. relevance assessors and quality raters).  The order in which a result is ranked will not be — as far as I’ve always understood Google’s position — hand-picked.

So I was much surprised recently to learn about a new initiative from Google that allows you to create a Google profile for yourself, which Google places into the 10th slot in the organic results when someone searches for your name!  From the official Google blog:

To give you greater control over what people find when they search for your name, we’ve begun to show Google profile results at the bottom of U.S. name-query search pages…Don’t have a Google profile? Just search for [me] and follow the instructions at the top of the page to create one. In just a few minutes, you can create a public profile that represents you and that appears when people search for your name on Google.

How is this not hard-coding of results?  Continue reading…

Do You Rotate Your Search Engine Usage?

It is good practice to rotate the mattress on your bed, to prevent lopsided wear-and-tear from shortening its useful life.  The same thing applies to car tires; they need rotating.  Smart travelers know to rotate the airlines from which they purchase tickets, as the accumulation over time of better per-ticket prices often outweighs the rewards or miles that come from a single airline’s loyalty perks. Even the internet itself works by allowing packets of information to dynamically rotate across different routes, based on traffic congestion, rather than tying up a full end-to-end circuit.

So why wouldn’t you rotate your search engine usage? Continue reading…

The Tyranny of Simplicity

One of my ongoing frustrations with modern, consumer-facing information organization and retrieval systems is the way in which functionality is often sacrificed in the name of simplicity.

Full functionality under the rubric of simplicity is a laudable goal, and I would agree that this is where we all eventually want to end up in the information systems, interfaces and algorithms that we are designing.  Simplicity without full functionality, but with alternative complex interfaces for an advanced user to specify greater functionality, is a satisfactory stepping stone along the path to this goal.  But simplicity with obstructed or stunted functionality, with no possibility for the user to improve that functionality, is too often what we end up with.

Case in point: Apple’s iTunes/iPod. Continue reading…

World Pinhole Photography Day

While the focus of this blog is the retrieval of existing information, from music to images to videos to text, every once in a while it is nice to create new information as well.  In that spirit I decided to participate in World Pinhole Photography Day, which is today, Sunday April 26, 2009.  While [...]

Google Similar Images: Only 20%?!

A few days ago, Google launched “similar image search” functionality.  From TechCrunch:

A new 20% time Google project has just launched called Google Similar Images. It’s pretty self-explanatory — when you search for an image and find one close to what you’re looking for, Google can now find ones that it believes to be [...]

Researcher on Fire

Over the past month and a half, computer science researcher and UQAM Professor Daniel Lemire has been on fire.  He’s written a series of blog posts on what it means to do research and be involved with a research community.  I’ve thoroughly enjoyed the whole series, and want to pass along pointers to his last 8 posts:

Continue reading…