Information Retrieval Foundations – Information Retrieval Gupf

A Button Without The Treat

jeremy — Mon, 13 Jun 2011 12:54:01 +0000

A few months ago I wrote a post entitled +1 is Explicit, but is not Relevance Feedback. I am often personally concerned that, with many of the posts I write, I am being pedantic. However, last week TechCrunch came to the same conclusion: +1 Is Like A Button You Push For A Treat — Without The Treat. Some highlights:

I understand the concept behind the +1 Button — it’s a smart one. You get people to click it and it improves the page’s search ranking for logged-in Google users with social connections (and eventually maybe all results). At least I think that’s how it works. But I have a hard time believing that all of you actually clicking on the button really get why you’re doing it. Don’t get me wrong, it’s great that you’re clicking on it! I am too on some of our stories. But I can’t help but get the feeling that it’s a bit like a cruel experiment we’re running. We put up a button, you click on it because it’s there, expecting you’ll get a treat. But there is no treat.

As I was saying a few months ago, +1 allows for explicit signaling. But that signaling just isn’t a relevance feedback-type of signaling. The person doing the clicking doesn’t actually get anything “fed back” from that action to their ongoing information seeking task. TechCrunch continues:

If the +1 Button is serving me up better results, I’m just not seeing it. And yes, I know the button push also populates your Google profile with a feed of our shared stories. But let’s be honest, no one is looking at those. We’re definitely not seeing any noticeable bump in pageviews coming from Google as a result of the button. Maybe that will slowly change over time, but I’m not convinced. The rate at which people are clicking on the button appears to be dropping each day. And soon it may be just like the *gulp* Buzz button.

This echoes what I said in my previous post:

In traditional feedback, an individual user marks a subset of documents as relevant and non-relevant, and then the system updates his or her ranked list results, immediately, so as to increase the recall (and sometimes also the precision) of documents not yet seen. There is a reason it is called feedback: the loop is closed. Just like when you hold a microphone too close to a speaker and start to get audio feedback. That’s only possible because the output of one input gets fed immediately back into that same input. Not into someone else’s input.

TechCrunch concludes:

Google needs to figure this out quickly. When you push a button, you need to get a treat. People will click for a while out of pure novelty and curiousness. But that only lasts so long. Without anything noticeable happening (like a share on Twitter, or a comment on Facebook), people will just ignore the button altogether. All over the web.

What has me scratching my head is why so many web search engines — and this +1 is just one example of the larger, industry-wide attitude — are so opposed to explicit relevance feedback. Yeah, I know the story: Altavista or Lycos tried some version of an explicit relevance feedback +1 button for a few months back in 1996 and it was found to not work well, because users were unwilling or too lazy to put any effort into using the tool. Well, with +1 and +1-type functionality, we’ve seen that users are indeed willing to put the effort into using the tool — at least until they find out that the tool isn’t really doing anything for them. So why not close the loop now — quickly! — before users build a strong association in their mind that +1-type buttons do nothing for the user, especially in the moment? That is an association that could take another 15 years to correct. Give the users a treat when they press the button. How? Close the loop of relevance feedback. This is an opportunity, not a criticism.

They Won People Over By A Logical Argument

jeremy — Fri, 10 Jun 2011 10:41:37 +0000

Via @glinden, I enjoyed this article on why GDrive (an early cloud document/file store) was never launched by Google:

At the time [2008], Google was about to launch a project it had been developing for more than a year, a free cloud-based storage service called GDrive. But Sundar [Pichai] had concluded that it was an artifact of the style of computing that Google was about to usher out the door. He went to Bradley Horowitz, the executive in charge of the project, and said, “I don’t think we need GDrive anymore.” Horowitz asked why not. “Files are so 1990,” said Pichai. “I don’t think we need files anymore.”

Pichai apparently went on to explain in more detail why files are no longer needed. It has to do with the notion that, in the cloud, you just have data and information. Organizing that information into files is not necessary, especially when you can just start editing that information directly in Google Docs. I’m going to ignore for a moment the “don’t be evil” ramifications of data portability and lock-in that come through the dissolution of explicit files — how am I supposed to export my data into the Microsoft Cloud Word or into Open Office or into VisiWord or whatever else I’d like to use, if files do not exist? Instead, I’m going to focus on how this decision was arrived at:

When Pichai first proposed this concept to Google’s top executives at a GPS—no files!—the reaction was, he says, “skeptical.” [Linus] Upson had another characterization: “It was a withering assault.” But eventually they won people over by a logical argument—that it could be done, that it was the cloudlike thing to do, that it was the Google thing to do. That was the end of GDrive: shuttered as a relic of antiquated thinking even before Google released it. The engineers working on it went to the Chrome team.

This is what I find absolutely fascinating. Here is a company that A/B tests everything in a heavily data-driven manner, down to which of 41 shades of blue the link anchortext should be. So you would think that such a momentous decision about killing the whole GDrive project would be data-driven. It was not. I quote again:

But eventually they won people over by a logical argument—that it could be done, that it was the cloudlike thing to do, that it was the Google thing to do.

Here is an instance where an important decision about a potentially very large service was made not by the data, but by a HiPPO, the highest-paid person in the room. Granted, that HiPPO did not just come out and declare his or her omnipotent will. Reason and logical argumentation were still needed. But reason and logical argumentation were all that was needed. Nobody had to go out and “prove the idea with code”, as Silicon Valley loves to say. Code was written for GDrive, but the code itself did not provide the proof of its own non-release. And the all-powerful Big Data didn’t even begin to enter into the equation. What provided the proof was a core logical argument, coupled with a strong vision for the future (“it was the cloudlike thing to do”) and an ounce of emotional appeal (“that it was the Google thing to do”).

This is a very refreshing story and I am heartened and encouraged by it. The reason this is exciting is that much of the research that I work on, such as iterative relevance feedback and explicit collaboration, is work that does not have an immediate outlet in the consumer search world. It might take years before the average user is ready to engage with some of these tools and techniques, rather than the typical five-month lifecycle of your average prove-with-code, throw-it-against-the-wall-see-if-it-sticks data-driven feature release. Furthermore, it takes much longer to develop some of this research, as it is more risky and exploratory, and the market might not be ready for it for a long time. At the same time, however, if one waits to start developing such technologies until the market is actually ready, then it is already too late.

For example, the common wisdom for over a decade was that users were too lazy or too unwilling to provide explicit relevance judgments on the information or documents with which they are interacting. So none of these tools were developed. All of a sudden, the Facebook “Like” button took off, and pretty soon the “+1” button was added, in complete contradiction to, and defiance of, ten years of “prove it with the data” arguments about users being unwilling to explicitly mark the relevance of their information.

The way around this problem is to be willing to let a HiPPO make a decision — based on logical argument rather than on log data or usage data — thereby clearing the organization to move forward with that decision. Start working on tools for explicit judgment years ahead of time, and you will be ready with fantastic solutions once the marketplace catches up. Are all such HiPPO decisions going to be correct? Of course not. But will fewer opportunities be missed, because you are willing to use logical argumentation to carve out a bold new vision for the future? Yes.

Don’t get me wrong; data-driven decision making is very useful. But it is useful for incremental improvements. If you want to take big leaps forward, such as the leap Google wanted to take in 2006 with its vision of the cloud, that requires a HiPPO being able to win people over — or being won over — by logical argument.

Workshop on Collaborative Information Retrieval (CIR 2011)

jeremy — Mon, 16 May 2011 18:58:52 +0000

Workshop on Collaborative Information Retrieval (CIR 2011)
CIKM’2011, Glasgow, UK, October 28th.
http://cir2011.fxpal.com/

Organizers
———-

– Gene Golovchinsky, FX Palo Alto Laboratory, Inc, USA.
– Jeremy Pickens, Catalyst Repository Systems, USA.
– Meredith Ringel Morris, Microsoft Research, USA.
– Juan M. Fernández-Luna, University of Granada, Spain.
– Juan F. Huete, University of Granada, Spain.
– Julio C. Rodríguez-Cano, University of Informatics Science, Cuba.

Introduction and Goal
———————

This is the third workshop we are organizing on the topic of collaborative information retrieval. The first workshop, held in conjunction with JCDL 2008, focused on broad topics and sought to establish a vocabulary for discussion about collaborative information seeking, to identify work practices and disciplines that might benefit from collaborative information seeking, and to establish a community of researchers with related interests. The second workshop, held in conjunction with CSCW 2010, built on the previous results, and focused on issues of communication and awareness in support of collaborative information seeking.

Our goal in this third workshop is to focus on algorithmic and other software issues related to information seeking in a collaborative setting. We would like to explore a variety of algorithms for mediating collaboration, and also to examine how different user interface elements can be used to support associated activity. Algorithmic aspects will include the coordination of input from multiple people, fusion and distribution of search results, and modifications to ranking algorithms based on group-specific information. Interface aspects will include support for awareness of individual and of group activity, role-specific interfaces, support for communication among collaborators, and support for transparency of search algorithms to foster a better understanding of the search space. It is also important to consider the effect that the starting context (e.g., IM chat, discussions in a social network, transitions from single-user to collaborative search, etc.) has on the algorithms and on the UI.

For more information, please see the Call For Papers at http://cir2011.fxpal.com/, which also includes a Demo session during which participants are encouraged to demonstrate interactive collaborative information seeking systems.

Call for Papers
—————

Support for explicit collaboration is an essential part of many information seeking activities. Explicit collaboration differs from recommendation systems and collaborative filtering in that the people engaged in information seeking have an explicitly shared information need. Hence, rather than inferring similarities of intent, the system is free to mediate the sharing of knowledge and division of labor. In the last few years, several research groups have pursued various issues related to collaboration during search, including support for awareness, algorithmic mediation, conceptual and software frameworks for collaboration, and collaboration through a range of different devices.

Explicit collaboration implies a certain emphasis on interaction. The system has to not only communicate search results to the user, but also mediate communication and data sharing among its users. There are new algorithms that need to be invented that use inputs from multiple people to produce search results, and new evaluation metrics need to be invented that reflect the collaborative and interactive nature of the task. Finally, we need to integrate the expertise of library and information science researchers and practitioners by revisiting real-world information seeking situations with an eye for shared information needs and explicit collaborative search.

We are looking for several kinds of submissions for the workshop.

– Position/work in progress papers, four to six pages in length in the standard ACM format, that describe work related to collaborative information seeking. Papers will be reviewed and a few will be selected for presentation; the rest will be invited to submit a poster instead. The submission date for papers is June 29th.

– Posters will present late-breaking or just-starting work in the area. Poster submissions should be two to four pages in the standard ACM format. The workshop schedule will include time to view and discuss posters. The submission date for posters is June 29th.

– Demos will show working systems that support collaborative search or other information seeking activities. Each accepted demo will include a plenary presentation slot of 15-20 minutes, during which significant aspects of the system can be described and demonstrated. We will also set aside time for a session (analogous to the poster session) during which participants could interact with the demo presenters and their systems. Those interested in participating in the demo track should submit a two to four page paper in the standard ACM format that will describe key aspects of the system and how it supports various aspects of collaborative search. These papers are also due on June 29th. In addition, because system building is sometimes unpredictable in the amount of time involved, to help us plan the schedule, we would like to receive an interim progress report from those intending to participate in the demo track some time in early September. Such a report should consist of a series of screenshots or a video screen-cast (or even a link to a working online system!). We’ll settle on an exact date once we see how many participants we have in this track. Demo acceptances are contingent on having a viable system available in time for the workshop; if participants are unable to get a system ready in time for the workshop, we will reclassify the submission as a Poster.

Important Dates
—————

– Submission date: June 29, 2011
– Notification of acceptance: July 29, 2011
– Revised papers due: August 12, 2011
– Conference dates: October 24-28, 2011
– Workshop Date: October 28, 2011

Submission Procedure
——————–

Submissions to the workshop will be handled through the EasyChair site. Please log into EasyChair from the following URL http://www.easychair.org/conferences/?conf=cir20110 to submit papers. Please note the trailing 0 on the URL!

For further questions
——————–

Please contact the organizers with any questions about the workshop. You can follow the workshop on Twitter with the #cir2011 hashtag.

+1 is Explicit, but is not Relevance Feedback

jeremy — Thu, 07 Apr 2011 12:19:28 +0000

A week or so ago, Google introduced its answer to the Facebook “Like”. It is called “+1”. Here is a quote from the official announcement:

The +1 button is shorthand for “this is pretty cool” or “you should check this out.” Click +1 to publicly give something your stamp of approval. Your +1’s can help friends, contacts, and others on the web find the best stuff when they search.

A discussion then ensued on Twitter about whether Google had finally introduced explicit relevance feedback to its system. For a long time, the user has been able to give implicit signals of preference to the search engine algorithm in the form of click-throughs. And conventional wisdom has held that users are too lazy or too disinterested to interact with a web search engine in any explicit manner beyond typing 2.7 keywords into the one-line search box. But now Google has introduced the +1. Does this mean that explicit relevance feedback is finally here?

My answer is no. And it is important to understand why.

First of all, +1 in its current state does not cause any changes in the search algorithm to happen at all. From a Mashable interview:

[Interviewer] Will the number of +1s affect search rankings?
[Google] Prosser says no, but adds that it’s something Google is “very interested” in incorporating in some form at some point.

Immediately it becomes clear that because +1 does not affect the search rankings, we cannot call it explicit feedback. It might be explicit rather than implicit. But it is not feedback.

However, my point runs deeper than that. Let’s suppose at some point in the future the +1 actually did start to affect the overall ranking algorithm (the system as a whole), as Google wants to eventually make happen. Even if it did that, it must be noted that +1 would still not affect your current search. That is, Google is pitching this as helping you find things that your friends have already found interesting. Therefore, when your friends do a +1, they aren’t actually the ones receiving the benefit. You are. Their searches aren’t actually improving. Yours are. Again from the official announcement:

Sometimes it’s easier to find exactly what you’re looking for when someone you know already found it. Get recommendations for the things that interest you, right when you want them, in your search results. The next time you’re trying to remember that bed and breakfast your buddy was raving about, or find a great charity to support, a +1 could help you out. Just make sure you’re signed in to your Google Account. In order to +1 things, you first need a public Google profile. This helps people see who recommended that tasty recipe or great campsite. When you create a profile, it’s visible to anyone and connections with your email address can easily find it. Your +1’s are stored in a new tab on your Google profile. You can show your +1’s tab to the world, or keep it private and just use it to personally manage the ever-expanding record of things you love around the web.

One Twitter commenter gave the analogy to voting, and said that because users are explicitly allowed to put a label on something (“vote” for it), that makes it explicit feedback. I would like to expand on that voting analogy. Yes, the +1 is explicit, like a vote. But because we’ve already established that the vote doesn’t “feed back” to your own current information need, but instead affects other users of the system for their future information needs, it would be like a citizen of Country A casting a vote for the leader of Country B, and a citizen of Country B casting a vote for the leader of Country C, and so on. Sure, both citizens are voting. But each is voting for someone else’s leader. So when the leader of Country B does something, it affects the citizen of Country B, rather than the citizen (of Country A) who actually voted for that leader. The analogy here is that leaders of countries are the search algorithms, and the vote is the +1. Sure, you can vote all you want. But if your vote doesn’t actually go toward your own leader, then your vote doesn’t actually affect what happens to you.

That is not what traditional IR literature means when it discusses relevance feedback. In traditional feedback, an individual user marks a subset of documents as relevant and non-relevant, and then the system updates his or her ranked list results, immediately, so as to increase the recall (and sometimes also the precision) of documents not yet seen. There is a reason it is called feedback: the loop is closed. Just like when you hold a microphone too close to a speaker and start to get audio feedback. That’s only possible because the output of one input gets fed immediately back into that same input. Not into someone else’s input.

+1 offers no such closed loop. Instead, a user explicitly expresses a preference for certain pieces of information. That preference gets (or will someday soon get) used to update the search engine’s algorithm. That updated algorithm will then alter/affect/change the results that some other user, with a perhaps related but also perhaps different information need, would have seen. The search engine results are indeed improved as a result of the explicit +1. But not for the original user, and not on that user’s existing information need. And especially not immediately. No results are reranked. No new or changed query suggestions are given in the moment that the user casts his or her +1 vote. In short, there is no feedback.

There is a word for user preference information that is fed into a search algorithm for the purpose of making that algorithm better: It is called “training data”. Search engine companies use training data to improve their overall algorithm, and make sure that future users of a system get better results than past users did. Currently, much of that training data in the web search world is implicit data: Click-throughs. But just making that data explicit rather than implicit (through the use of +1’s) does not change the fundamental nature of the data; it’s still training data for the search algorithm, not feedback to and from the user. With +1 training data, the user loop is open, not closed. No algorithmic updates are flowing from the user back to the user for assistance with and improvements on the user’s current information need.
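To make the distinction concrete, here is a minimal sketch of a closed loop, using Rocchio-style query modification as one classic form of relevance feedback. It is an illustration under stated assumptions, not a description of any real search engine: the toy documents, the bag-of-words cosine scoring, the helper names (`vectorize`, `cosine`, `rank`, `rocchio`) and the weights `alpha`, `beta`, `gamma` are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Toy collection (made up for illustration only).
DOCS = {
    "d1": "housing bubble mortgage lenders warned regulators",
    "d2": "bubble tea shop opens downtown",
    "d3": "mortgage lenders subprime risk regulators ignored",
    "d4": "housing market open house listings",
}

def vectorize(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query_vec, docs, candidates):
    """Order candidate doc ids by similarity to the query, best first."""
    return sorted(candidates,
                  key=lambda d: cosine(query_vec, vectorize(docs[d])),
                  reverse=True)

def rocchio(query_vec, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward judged-relevant docs and away from non-relevant ones."""
    updated = Counter()
    for term, weight in query_vec.items():
        updated[term] += alpha * weight
    for group, sign in ((relevant, beta), (nonrelevant, -gamma)):
        for vec in group:
            for term, weight in vec.items():
                updated[term] += sign * weight / len(group)
    # Keep only positively weighted terms, a common simplification.
    return {t: w for t, w in updated.items() if w > 0}

query = vectorize("housing bubble")
unseen = ["d3", "d4"]                      # documents the user has not judged yet
before = rank(query, DOCS, unseen)

# The same user judges d1 relevant and d2 non-relevant...
new_query = rocchio(query,
                    relevant=[vectorize(DOCS["d1"])],
                    nonrelevant=[vectorize(DOCS["d2"])])
# ...and that same user's remaining results are re-ranked right away.
after = rank(new_query, DOCS, unseen)

print("before feedback:", before)
print("after feedback: ", after)
```

The point of the sketch is where the judgments go: they modify this user's query for this task, and the unseen documents are re-ranked in the same session. In a training-data setup, the same judgments would instead be logged and folded into a model that serves other users later.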

In summary, my argument boils down to this: +1, while explicit, does not offer a closed loop for a user’s open, unsatisfied information need. Therefore, the purpose of +1 as Google envisions it is as training data, and not as relevance feedback. Explicit training data, yes. And it most likely carries a much stronger signal:noise ratio than implicit training data (click-throughs). But it is (or will be once Google incorporates the signal) still naught but training data.

Why it matters

Ok, so +1 is explicit training rather than explicit feedback. Who cares? Aren’t we just arguing over terminology, splitting hairs over my favorite versus your favorite word? No. It is important to understand the difference between training data and relevance feedback because it helps you understand what a search engine is doing for you (the user) and why. It boils down to a question of user information need type: Do you have a precision-oriented information need (e.g. home page finding, recipe finding, address finding, etc.) or do you have a recall-oriented information need (are you seeking to understand everything that was published in the news media about the housing bubble, prior to its collapse, so as to paint a picture of who might have known more than they’re letting on)? If you have a precision-oriented need, then you don’t need explicit relevance feedback. You are more than likely fine with a system that trains itself on both the explicit and implicit actions of other users (either worldwide or even just your friends — it is still training data) so as to make your search for [coffee shop chicago] a little better. Remember:

The +1 button is shorthand for “this is pretty cool” or “you should check this out.” Click +1 to publicly give something your stamp of approval. Your +1’s can help friends, contacts, and others on the web find the best stuff when they search.

But if your information need is deeper, and you don’t just need to find a small handful of nearby coffee shops, but instead are attempting to make sense of a larger issue, then explicit data used for system parameter estimation under a social Learning to Rank regimen is not going to be sufficient. You want to be able to instruct the system in the moment about the pieces of information that you are finding, and have it correct itself in that same moment, for that exact task. You want there to be a closed loop between your actions and the machine’s actions, with each immediately influencing the other. In short, you want relevance feedback. Some of that relevance feedback might be implicit, some might be explicit. But in both cases, that information is being “fed back” to the task, and then new information is returning to you, in real time. This lets you go deeper on your current task than you could if you were simply waiting for a hundred other friends to click +1 on all the newspaper articles that you are looking for. Understanding this difference is key to understanding how you as a user approach a system, and what you do with it. It is not productive if the user thinks that the system is doing one thing, and it’s actually doing another.

Now, in no wise does this mean +1 is not useful. In fact, it’s so useful (for particular types of information needs) that folks such as Barry Smyth have been doing similar social things for 6-7 years now (see HeyStaks). My goal was not to comment on the efficacy of +1, and in fact I think it is rather nice that a forward step toward more user interaction in the form of explicit judgments is finally becoming important to Google. Rather, it was to seek clarity and clear delineation on what that efficacy is actually trying to do, and for whom, and by what mechanism, and why. The explicit judgment is explicit training data. It is not explicit relevance feedback.

Search Algorithms versus Asimov’s First Law of Robotics

jeremy — Thu, 16 Dec 2010 13:13:45 +0000

Search Engine Land has a short article on bias versus brands. The issue at hand is whether Google Instant has a brand bias. Google says it does not:

Singhal explains that when someone types in T, mathematically “most people typing T will go to Target. That’s the probability model. If you add R to it (“Tr”), most people are looking for a translation system. It’s actually just pure mathematical modeling.” It is just math, he says, not a bias.

Oh come on, now! What kind of explanation is that? There is no such thing as “just math”. There is always a conscious decision to use math in a particular way.

Let’s take as an example the classic information retrieval ranking function: tf * idf. Researchers have long known that it is important to rank documents by their “it’s just math” query term frequency within a document. However, it is just as, if not more, important to correct for that internal term frequency by using global frequency statistics, such as (inverse) document frequency. The reason is that if you have a query such as [the table] and you do not correct for the collection-wide ubiquity of the word [the], your document rankings will be dominated by probabilities of [the] within a document. IDF values are used to correct this bias, and bring the term [table] to much higher prominence. Experimental results almost always show that tf * idf beats ranking by tf alone. (Aside: Even language models have an idf-like, global probability smoothing factor to correct for tf alone.)
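The [the table] example can be sketched in a few lines. This is a toy illustration under assumptions of my own choosing: a four-document made-up collection, raw term counts for tf, and idf computed as log(N / df). The names `score_tf`, `score_tfidf` and `idf` are hypothetical helpers, and production systems use more refined weighting than this.

```python
import math
from collections import Counter

# Made-up collection: every document contains "the", only two contain "table".
DOCS = {
    "d1": "the king and the queen and the prince left the castle by the sea",
    "d2": "the carpenter built a table",
    "d3": "the weather was cold",
    "d4": "the table stood by the window near another table",
}
TERM_COUNTS = {doc_id: Counter(text.split()) for doc_id, text in DOCS.items()}
N = len(DOCS)

def idf(term):
    """Inverse document frequency, log(N / df); 0 for terms in no document."""
    df = sum(1 for counts in TERM_COUNTS.values() if term in counts)
    return math.log(N / df) if df else 0.0

def score_tf(query, doc_id):
    """Sum of raw within-document term frequencies (the 'just math' score)."""
    return sum(TERM_COUNTS[doc_id][t] for t in query)

def score_tfidf(query, doc_id):
    """Same sum, with each term corrected by its collection-wide ubiquity."""
    return sum(TERM_COUNTS[doc_id][t] * idf(t) for t in query)

query = ["the", "table"]
tf_ranking = sorted(DOCS, key=lambda d: score_tf(query, d), reverse=True)
tfidf_ranking = sorted(DOCS, key=lambda d: score_tfidf(query, d), reverse=True)

print("tf only:", tf_ranking)
print("tf*idf: ", tfidf_ranking)
```

With tf alone, d1 ranks first even though it never mentions a table, simply because it repeats [the] five times. Since [the] occurs in every document, its idf is log(4/4) = 0, so under tf * idf the two documents that actually contain [table] rise to the top.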

Thus, to make a ranking algorithm truly useful, the mathematics have to be designed to account for the “it’s just math” probabilities. Without such correction, the algorithms are biased away from relevance. So to claim that brands dominate because “it’s just math” masks the deeper issue that existing bias within Google Instant isn’t being corrected and is propagated to the user.

At the risk of getting a little too geeky, I am reminded of the Asimov laws of robotics. The first and primary law is: “A robot may not injure a human being or, through inaction, allow a human being to come to harm.” Note that there are two edges of the “no harm” sword: No harm through commission, and no harm through omission. Both are required.

I think that there is an analogy to brand bias within search algorithms. In the search engine domain, the robot is the algorithm. Just like it is not enough for a robot to avoid committing acts of harm against a human, it is also not enough for the algorithm to throw up its hands and say, “it’s just math, and therefore I haven’t actively, explicitly committed any bias in my rankings.” No, the algorithm has to simultaneously be aware of the biases that arise through inaction, or omission, as well.

Click-through probabilities are the tf, the component of the ranking algorithm that ensures true, unbiased, no-acts-of-commission probability. Where is the idf, to balance out that math, to ensure that no bias by omission slips through? Without it there is still bias. “It’s just math” is not a proper defense.

Agree? Disagree?

Miffed and Confused

jeremy — Wed, 15 Dec 2010 14:16:52 +0000

I have been on a six-month blogging hiatus, and wouldn’t you know it... it took another fun Google article to pull me back. It is a recent Fast Company piece, entitled Google to Zuckerberg, Bing: We Still Innovate. The premise of the article is that Facebook has recently partnered with Bing to deliver social search, and has cited Google’s slowed rate of innovation as one of the primary motivators for this move. This has left Google, one source says, “miffed and confused as to how Zuckerberg figured they weren’t innovating”.

Perhaps I could be of assistance.

The article cites a number of reasons why Google is miffed and confused about Facebook’s stance: “The company has more people working on search than ever before”. It has a “list of 100 [search] projects”. And “last year, the team launched about 550 changes to its search engine, and in September, unveiled Instant, one of the largest overhauls to its engine ever.” The article continues:

Every year, he says, Google runs thousands of experiments. These experiments include just about everything you could imagine: changing the color of a link or button; improving Arabic semantics; building real-time search; or creating Google Instant, the results-as-you-type feature. As Singhal says, it’s simply a function of having more resources: His team is able to test hypotheses faster. “We couldn’t do these things five years back even if we wanted to,” he explains. “We didn’t have enough engineers. But by having a bigger team today, we have new ideas, new people, and the capacity to execute on those ideas.”

So why would Facebook say that Google isn’t innovating, especially when 550 changes have been made? I think the next quote, by Google Fellow Amit Singhal, illustrates the problem perfectly:

Fast Company broached the subject recently with Google’s Amit Singhal, who oversees Google’s ranking and algorithm team. “The main reason why Google is where it is today is that we have been able to make huge changes to our search–and are able to do it while running this big search engine,” he says “We used to compare that to changing an engine on a jet while it’s flying. Over the years, we’ve not only mastered changing the engine while flying, but have been able to change the seats without the users noticing. That’s the beauty of how we innovate. You’ve suddenly given everyone first class seats, and they didn’t even wake up.”

The first problem is that because (most of) those 550 changes happen while the users are still “asleep”, users don’t actually notice them. Google doesn’t exactly go out of its way to make many of its search improvements visible to the user, and so it’s often difficult to tell whether or not something has happened. As a user, I personally don’t like that approach, because a change that is invisible or purposely hidden is a change that I as a user have no control over, and am not able to change back or alter further. And as I argued in an earlier post, the way to create passionate search users is not to give them luxury seats without waking them up. Instead, the way to create passionate search users is to give them search tools that offer a path along which they can grow, improve, and get better at searching. Do users get better at flying, or at seeing and comprehending an information landscape from 30,000 feet, if they’ve got luxury chairs? Arguably not. If anything, the luxury chairs make it harder for users to sit upright, to have a “leaning forward”, engaged experience. Users are less inclined, pun intended, to be active participants in the experience. All the decisions are being made for them.

But let’s set the user perception issue aside for a moment. Even if the user doesn’t notice those 550 improvements at a conscious level, that doesn’t change the fact that Google has innovated 550 times over the past year, does it? Of course not. The innovations have still happened; they exist. But what innovations are they? Well, as Singhal’s airplane analogy suggests, they are improvements that make the existing experience faster and a little more comfortable. Cushier seats. A better shade of link blue. More legroom. 5 pixel margins rather than 2 pixel margins. A faster plane with a more powerful engine. Google Instant.

But at the end of the day, it’s still a plane. And the view of the information landscape is still from 30,000 feet, even if that view is an Instant view. What if instead of getting the high level overview of relevant information, the user wants to dig down into a narrow, deep, richer vein? What if the user wants to mine for precious information ores, rather than fly over the mountain five miles overhead? What if the user wants the information retrieval engine to act as an excavator, deep earth drill, or other such heavy mining tool? Do any of those 550 changes help make the airplane more like an underground drill? Or do all 550 changes simply make a sleeker, faster airplane?

I’ve talked about this issue of evolutionary vs. long term thinking (improving the airplane, versus changing it into a deep earth drill) in the past. I’ve also asked for search to change radically, to help me in much harder information seeking tasks such as finding hidden cafes in Prague. But I think this question of innovation is illustrated perfectly in the following bit from the article:

But what about social search? Facebook teamed with Microsoft, not Google. Does Google have any partnerships planned for social search? “I’m glad you asked, because we launched social search about two years or so back,” says Singhal.

When Google launched social search 13.5 months ago (October 26, 2009 to December 9, 2010 is not two years), what they launched was this:

A lot of people write about New York, so if I do a search for [new york] on Google, my best friend’s New York blog probably isn’t going to show up on the first page of my results. Probably what I’ll find are some well-known and official sites. We’ve taken steps to improve the relevance of our search results with personalization, but today’s launch takes that one step further. With Social Search, Google finds relevant public content from your friends and contacts and highlights it for you at the bottom of your search results. When I do a simple query for [new york], Google Social Search includes my friend’s blog on the results page under the heading “Results from people in your social circle for New York.” I can also filter my results to see only content from my social circle by clicking “Show options” on the results page and clicking “Social.”

Having worked in the area of Collaborative Search (see also this) for the past four years, an area that is not unrelated to Social Search, I long ago learned to make the following distinction: There is a difference between process and data. Data-based social search is the idea of having content generated by your social circle show up in your results, e.g. your friend’s NY blog. Process-based social search is the idea of using your friends’ patterns of information seeking behavior to influence the ranking of content from outside of your social circle.

Another way of expressing this distinction is “search of social data” versus “social search of data”.

Showing your friend’s blog when you search for New York is search of social data. It’s interesting, but I wouldn’t necessarily characterize it as an innovative, game changing leap. Social search of data, on the other hand, is a much more radical approach, and much more of a leap. It affects how one finds every piece of information on the entire web, not just your friends’ blogs. I started seeing the concept of social search of data appear 4 to 5 years ago, with the work of Barry Smyth, and Microsoft publicly started publishing work in this area around three years ago. So it does make sense to me that Facebook would partner with a company that has more of a track record in social search of data, rather than search of social data.
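To make the process versus data distinction concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy result list, the `friend_clicks` counts, and the `boost` weight are all hypothetical, and none of it describes how Google, Microsoft, or Facebook actually rank anything.

```python
# Hypothetical toy data: (url, author, base_score) from a generic web ranker.
RESULTS = [
    ("nyc.gov", "city", 0.9),
    ("nytimes.com", "press", 0.8),
    ("hidden-eats.example", "stranger", 0.4),
    ("friend-blog.example", "alice", 0.3),
]


def search_of_social_data(results, friends):
    """Data-based: surface only content authored by people in my circle."""
    return [url for url, author, _ in results if author in friends]


def social_search_of_data(results, friend_clicks, boost=0.2):
    """Process-based: rerank ALL content using my friends' seeking behavior.

    friend_clicks maps url -> how often my friends clicked it (assumed signal).
    """
    rescored = [
        (score + boost * friend_clicks.get(url, 0), url)
        for url, _, score in results
    ]
    return [url for _, url in sorted(rescored, reverse=True)]


# Only my friend's own blog comes back from the data-based version.
social_only = search_of_social_data(RESULTS, {"alice"})

# A stranger's page that my friends keep clicking rises to the top.
reranked = social_search_of_data(RESULTS, {"hidden-eats.example": 3})
```

The first function can only ever return pages my friends wrote. The second touches the ranking of every page, which is why I consider it the larger leap.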

Don’t get me wrong; I am not saying Google doesn’t innovate. It does. I am simply trying to explain to those who were miffed and confused why someone would say that. A better shade of link blue makes the search process more comfortable; it plushes up the airplane seats. An entire engine rebuild so as to allow instant results makes things faster. But it doesn’t fundamentally alter the manner in which information is found. It doesn’t utilize social behaviors to rerank the entire web. It doesn’t let me dig deeper, and find hidden cafes in Prague. It just lets me not find those same hidden Prague cafes…faster and more comfortably.

If the engine rebuild around Google Instant was one of the “largest overhauls to its engine ever” as the article says above, and it only quantitatively changes the speed at which results come back rather than qualitatively changing the manner in which information seeking happens, then it is not unreasonable to seek a different type of innovation. At some point search has to become more than precision@3, more than a fast, comfortable ride. At some point search has to become a real tool for exploration and growth, for comparison and learning. At some point, the definition of innovation has to move from step to leap. Google has the engineering chops to make this happen. But do they have the culture?

The Search User Wants a Story

jeremy — Fri, 25 Jun 2010 18:25:50 +0000

I fired up reddit this morning and was completely flabbergasted by one of the top posts. The title of the post was “This is Why I Use Google, Not Bing”. And it linked straight to this screenshot (which I reproduce here, in case the target disappears at some point):

This blew my mind, not only that an alphageek would prefer the (Google) interface on the left to the (Bing) interface on the right, but that the redditor alphageek community would so heavily upvote it. The way I see it, this speaks directly to the issues of simplicity as storytelling vs. sparsity that I’ve talked about from time to time. The interface on the left is anything but sparse. In fact, it is extremely busy and filled with images, a tool belt of various verticals (news, video, images), query modification tools such as timelines and recency sorting, and query reformulation tools such as narrowly related searches (top middle) and broadly related searches (lower left).

In short, everything about it is “non-Googly”, i.e. non-sparse and non-clean. Ironically, the Bing results page for this particular query, which is held up as the example of what not to do, is the cleaner one.

So why is it that thousands of Google-loving redditors prefer the interface that is, well, more Bing-like? Could it be that the user is finally starting to understand that simplicity is not the same thing as sparsity? That what matters is the story? The Google results in this case tell a really good story. They give a concise overview of the latest matches and scores. They link directly to highlights. They give a concise overview of upcoming matches and the time at which each occurs. And they acknowledge that when you search for “World Cup”, you’re not just trying to navigate to a single page. Instead, you are “exploratorily” looking for as much information as you can about what is happening at the event as a whole, and perhaps even with football (soccer) as a whole. This is not just a “one box” answer. This is a whole “cluttered” set of rich information and interaction options.

That’s the story. And if it takes a non-sparse (complex or cluttered) interface to tell that story, then so be it. The story is more important than the strict adherence to sparsity. Which is something that I’ve been hammering on about for at least the past half decade now. It is just encouraging to see users finally start to acknowledge it.

Now, all we need to do is let the redditor community know that even though Google beat Bing on this one particular query, overall Bing has been pushing more of this story-appropriate, non-sparse, information rich (“cluttered”) interaction in their results. What I wish users did more of is constantly rotate between the various engines, to know for themselves which queries work on which engines, and what each of the various engines is capable of. Because the irony here is that the redditor who posted “This is Why I Use Google, Not Bing” has chosen an interface that is much more Bing-like, and less traditionally “Googly”.

See also my related post, about two Googlers (Norvig and an anonymous employee) and their comments about Bing at the Semantic Technology conference in June 2009.

Update: In the couple of minutes between when I saw the reddit link and when I finished writing this post, the Google vs. Bing image went from 4th on the reddit home page (with ~500 upvotes) to 2nd (with ~750 upvotes). Clearly this has touched a nerve. It’s very interesting to see this reaction, especially because the preferred interface, again, is so traditionally non-Googly and cluttered.

More on Simplicity and the Paradox of Choice

jeremy — Wed, 23 Jun 2010 16:12:48 +0000

I came across an interesting blogpost today, entitled “The Paradox of Choice is Not Robust”. To requote their quote:

Benjamin Scheibehenne, a psychologist at the University of Basel, was thinking along these lines when he decided (with Peter Todd and, later, Rainer Greifeneder) to design a range of experiments to figure out when choice demotivates, and when it does not.

But a curious thing happened almost immediately. They began by trying to replicate some classic experiments – such as the jam study, and a similar one with luxury chocolates. They couldn’t find any sign of the “choice is bad” effect. Neither the original Lepper-Iyengar experiments nor the new study appears to be at fault: the results are just different and we don’t know why.

After designing 10 different experiments in which participants were asked to make a choice, and finding very little evidence that variety caused any problems, Scheibehenne and his colleagues tried to assemble all the studies, published and unpublished, of the effect.

The average of all these studies suggests that offering lots of extra choices seems to make no important difference either way.

I’ll let that speak for itself, and will note only a few of my related blog posts from a year+ ago: Google Search Options and the Paradox of Choice and Ranked Lists and the Paradox of Choice.

Simplicity: Sparsity or Storytelling?

jeremy — Thu, 10 Jun 2010 17:39:00 +0000

A tweet by @akumar prompted me to punch up this quick blogpost:

as with all controversial issues, there’s a positive in google trying bing/image – that they’re not afraid to learn from competition

What Amit is referring to is the recent addition of gorgeous photographic images as search page background. See for example this writeup: http://blogs.abcnews.com/theworldnewser/2010/06/google-vs-bing-copycat-picture-on-prominent-page.html

He is of course correct; Google is learning from the competition. But there is another issue at play here, one that I don’t want to overlook because I feel it is very important. It is the issue of simplicity. What is simplicity? How is it defined? How is it measured? Conversely, what is complexity? What is clutter?

For over a decade now, Google has essentially defined simplicity as sparsity. Sparse backgrounds, lots of negative space, sparse color schemes, sparse auxiliary information (e.g. query term suggestions on the SERP page have only started appearing in the last year or two, despite the fact that such features existed 15 years ago in search engines of old such as Infoseek and Altavista). The reason given was that people didn’t like clutter, that people like simplicity. And in Google’s definition, simplicity equals sparsity.

I agree. People do like simplicity. I don’t question the veracity of that general sentiment. What has always bothered me, though, is the equating of simplicity with sparsity. I think a much better definition of simplicity is not the amount of information or colors or negative space on a page, but the story that a design, interface, interaction, or algorithm tells. Something with a lot of colors and links and words can still be simple…if it tells a clear story! Conversely, something with fewer colors and links (sparser) can be more complex, if the story that it communicates is muddy and not as purposely focused.

This brings us to the Bing background image. In my opinion, even though the inclusion of a background image is less sparse and more “cluttered” (more colors, more shapes, more textures), it actually assists in the telling of a clearer story. Why? Because it more cleanly separates foreground and background, subject and frame. It provides compositional balance to the page. The white query input box on white background (10+ years of Google design) is sparser, but the story that it tells is less clear because foreground and background are not as cleanly separated. A white query input box on a richly colored and textured background tells a clearer, simpler story because the background image frames and separates the foreground query input box. Furthermore, because you can now distinguish background and foreground, you can more clearly see that the query input box lies near the pleasing “rule of thirds” line, which aids further in the overall storytelling.

In short, I applaud this move by Google, just as I applaud it from Bing. I never liked the white-on-white, because sparsity is not the same thing as simplicity. Simplicity arises through good storytelling, not through minimalism. No A/B testing will tell you this, though. It’s a definitional issue that must be settled before you start your A/B tests. Google has learned from the competition, as @akumar says. But I hope that the lesson Google has learned is not just that users like pretty pictures. I hope the lesson is that, when it comes to simplicity, there is a difference between sparsity and storytelling.

See also my posts: The Tyranny of Simplicity, The Tyranny of Simplicity, Redux, and The Craft of Storytelling. I also found this older discussion on Google’s Lively to be a fascinating read. In my understanding, the issue of “necessary complexity” that the author of that post hammers home is related to the issue of storytelling. Too much sparsity (of interaction in Lively’s case) leads to an inability to tell a clear story. Simplicity is storytelling, not sparsity.

Seeing Stars

jeremy — Wed, 28 Apr 2010 20:59:52 +0000

There is an interesting blogpost on the Official Google blog today, about seeing stars:

We’ve long believed that personalization makes search more relevant and fun. For nearly five years, we’ve been tailoring results with personalized search. Today we’re announcing a new feature in search that makes it easier for you to mark and rediscover your favorite web content — stars. With stars, you can simply click the star marker on any search result or map and the next time you perform a search, that item will appear in a special list right at the top of your results when relevant. That means if you star the official websites for your favorite football teams, you might see those results right at the top of your next search for [nfl].

So it sounds to me like this is a sort of bookmarking. What is not as obvious, however, is what this sentence means: “the next time you perform a search, that item will appear in a special list right at the top of your results when relevant”. Does that mean the next time you perform the same search (e.g. [nfl]) that starred item will appear at the top? Or is it more dynamic than that? I.e., if I happen to perform the search [new england patriots], and that same link that I’d previously starred after executing the [nfl] query happens to be ranked in the top k, will it again appear at the top of my list? (And if so, what is the cutoff/threshold for k?) Similarly, if Google’s ranking of my original [nfl] query changes, due to shifting PageRank calculations, changes in freshness, or any of the hundreds++ of other signals that go into the ranking algorithm, and my particular starred web page no longer appears in the top k because it is no longer relevant to the [nfl] query using the signal vector from the current state of the index, will the starred item not appear? After all, Google says that the starred item will only appear if it is relevant, and if it is no longer relevant to the [nfl] query, as determined by Google’s relevance algorithm, then it won’t appear? Even though I had previously starred it with respect to that exact query?
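The two readings can be written down side by side. The sketch below is purely hypothetical: the function names, the shape of the star data, and the cutoff `k` are my own assumptions, not anything Google has described.

```python
def starred_by_query(query, stars_by_query):
    """Reading 1: a star is tied to the exact query it was made under."""
    return stars_by_query.get(query, [])


def starred_by_relevance(ranked_urls, starred_urls, k=10):
    """Reading 2: a star shows whenever the item is in the current top k."""
    return [url for url in ranked_urls[:k] if url in starred_urls]


# Hypothetical state: I starred one page after searching [nfl].
stars_by_query = {"nfl": ["patriots.example"]}
starred_urls = {"patriots.example"}

# Reading 1 shows nothing for a different query.
same_query_only = starred_by_query("new england patriots", stars_by_query)

# Reading 2 shows the star for any query that ranks the page in the top k...
dynamic = starred_by_relevance(["patriots.example", "espn.example"], starred_urls)

# ...but drops it as soon as the ranking pushes the page below the cutoff.
dropped = starred_by_relevance(
    ["espn.example", "nfl.example", "patriots.example"], starred_urls, k=2
)
```

Under the second reading my star can silently disappear without my having done anything, which is exactly the kind of behavior that makes the mental model hard to form.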

The post continues:

In our testing, we learned that people really liked the idea of marking a website for future reference, but they didn’t like changing the order of Google’s organic search results. With stars, we’ve created a lightweight and flexible way for people to mark and rediscover web content.

Now I am thoroughly confused. People didn’t like changing the order of Google’s organic search results, but at the same time, they claim earlier in the post that “For nearly five years, we’ve been tailoring results with personalized search.” What does it mean to personalize search results, if not to change the order of Google’s organic search results? Quoting the earlier post:

With the launch of Personalized Search, you can use that search history you’ve been building to get better results. You probably won’t notice much difference at first, but as your search history grows, your personalized results will gradually improve.

So if users didn’t like changing the order of the organic search results, does this mean that Google has turned off (or will be turning off) personalization completely for all signed-in users? Or does personalization co-exist with explicit starring/bookmarks? If so, how exactly does that work? Will Google change the order (personalize) your organic results using only the signals of query history and implicit relevance (i.e. clickthrough), but not the signal of explicit starring? That’s even more confusing…the amount of mental jazz involved is a bit overwhelming. Sure, the interface jazz is kept to a minimum, but at the expense of making the user’s mental model of what the search engine is actually doing for him or her even more muddled.

Perhaps the best way to sort out this confusion is to dive in headfirst and start playing around with the system, seeing what it actually does and when. But I personally have a difficult time generating the gumption to use a feature for which I have an unclear mental model, an unclear understanding of what it is trying to do for me, how it might change, when it might or might not magically appear. Especially when some of my actions affect the state of the system and others do not.

One thing I do like about this feature, however, is that it uses out-of-band displays to show different types of information. Rather than trying to mix global/non-personalized results, implicit personalized results, and starred results, it lets you know via a separate channel whether there is any information that you have previously starred. This is an IR design principle that I would like to see more of — separate goals in separate channels. Examples of different IR goals include navigation, re-finding, discovery, exploration, etc. Rather than trying to mix results from all of these goals into a single channel (a single ranked list) it is quite useful to separate each goal from the other. This new Google interface does that. What exactly the goal attached to that separate channel is, again, unclear. But the existence of a separate channel is an interesting and exciting approach, one that I hope to see more of.
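As a rough illustration of separate goals in separate channels, a response could carry one list per goal instead of a single blended ranking. This is only a sketch under my own assumptions: the channel names and the `respond` function are hypothetical, and it says nothing about how Google builds its results page.

```python
def respond(ranked_urls, starred_urls, history_urls):
    """Return one list per goal rather than one merged ranked list."""
    return {
        # Re-finding: things I explicitly marked before.
        "re-finding": [u for u in ranked_urls if u in starred_urls],
        # Implicit personalization: things my history suggests I care about.
        "personalized": [u for u in ranked_urls if u in history_urls],
        # The global, non-personalized ranking, left untouched.
        "global": list(ranked_urls),
    }


channels = respond(["a.example", "b.example", "c.example"],
                   starred_urls={"b.example"},
                   history_urls={"c.example"})
```

Because the global list is never reordered, I can always tell which channel a given result came from.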