Information Retrieval Gupf

They Won People Over By A Logical Argument

Posted on June 10, 2011 by jeremy

Via @glinden, I enjoyed this article on why GDrive (an early cloud document/file store) was never launched by Google:

At the time [2008], Google was about to launch a project it had been developing for more than a year, a free cloud-based storage service called GDrive. But Sundar [Pichai] had concluded that it was an artifact of the style of computing that Google was about to usher out the door. He went to Bradley Horowitz, the executive in charge of the project, and said, “I don’t think we need GDrive anymore.” Horowitz asked why not. “Files are so 1990,” said Pichai. “I don’t think we need files anymore.”

Pichai apparently went on to explain in more detail why files are no longer needed. It has to do with the notion that, in the cloud, you just have data and information. Organizing that information into files is not necessary, especially when you can just start editing that information directly in Google Docs. I’m going to ignore for a moment the “don’t be evil” ramifications of data portability and lock-in that come with the dissolution of explicit files: how am I supposed to export my data into the Microsoft Cloud Word, or into Open Office, or into VisiWord, or whatever else I’d like to use, if files do not exist? Instead, I’m going to focus on how this decision was arrived at:

When Pichai first proposed this concept to Google’s top executives at a GPS—no files!—the reaction was, he says, “skeptical.” [Linus] Upson had another characterization: “It was a withering assault.” But eventually they won people over by a logical argument—that it could be done, that it was the cloudlike thing to do, that it was the Google thing to do. That was the end of GDrive: shuttered as a relic of antiquated thinking even before Google released it. The engineers working on it went to the Chrome team.

This is what I find absolutely fascinating. Here is a company that A/B tests everything in a heavily data-driven manner, down to which of 41 shades of blue the link anchor text should be. So you would think that such a momentous decision as killing the whole GDrive project would be data-driven. It was not. I quote again:

But eventually they won people over by a logical argument—that it could be done, that it was the cloudlike thing to do, that it was the Google thing to do.

Here is an instance where an important decision about a potentially very large service was made not by the data, but by a HiPPO, the highest-paid person in the room. Continue reading →

Posted in General, Information Retrieval Foundations | Leave a comment

Workshop on Collaborative Information Retrieval (CIR 2011)

Posted on May 16, 2011 by jeremy

Workshop on Collaborative Information Retrieval (CIR 2011)
CIKM’2011, Glasgow, UK, October 28th.
http://cir2011.fxpal.com/

Organizers
----------

- Gene Golovchinsky, FX Palo Alto Laboratory, Inc, USA.
- Jeremy Pickens, Catalyst Repository Systems, USA.
- Meredith Ringel Morris, Microsoft Research, USA.
- Juan M. Fernández-Luna, University of Granada, Spain.
- Juan F. Huete, University of Granada, Spain.
- Julio C. Rodríguez-Cano, University of Informatics Science, Cuba.

Introduction and Goal
---------------------

This is the third workshop we are organizing on the topic of collaborative information retrieval. The first workshop, held in conjunction with JCDL 2008, focused on broad topics and sought to establish a vocabulary for discussion about collaborative information seeking, to identify work practices and disciplines that might benefit from collaborative information seeking, and to establish a community of researchers with related interests. The second workshop, held in conjunction with CSCW 2010, built on the previous results, and focused on issues of communication and awareness in support of collaborative information seeking.

Our goal in this third workshop is to focus on algorithmic and other software issues related to information seeking in a collaborative setting. Continue reading →

Posted in Collaborative Information Seeking, Exploratory Search, Information Retrieval Foundations | Leave a comment

+1 is Explicit, but is not Relevance Feedback

Posted on April 7, 2011 by jeremy

A week or so ago, Google introduced its answer to the Facebook “Like”. It is called “+1”. Here is a quote from the official announcement:

The +1 button is shorthand for “this is pretty cool” or “you should check this out.” Click +1 to publicly give something your stamp of approval. Your +1’s can help friends, contacts, and others on the web find the best stuff when they search.

A discussion then ensued on Twitter about whether Google had finally introduced explicit relevance feedback to its system. For a long time, the user has been able to give implicit signals of preference to the search engine algorithm in the form of click-throughs. And conventional wisdom has held that users are too lazy or too disinterested to interact with a web search engine in any explicit manner beyond typing 2.7 keywords into the one-line search box. But now Google has introduced the +1. Does this mean that explicit relevance feedback is finally here?

My answer is no. And it is important to understand why.

First of all, Continue reading →

Posted in Information Retrieval Foundations | 5 Comments

Close the Loop!

Posted on December 17, 2010 by jeremy

The web is abuzz this week with talk of the Google Books Ngram Viewer. It’s a great tool, and leads to some very interesting exploration and trend visualization. So does this tool fly in the face of my rant from a few days ago, about how Google’s improvements to search are all automated improvements, with no opportunity for the user to learn and grow?

The first problem is that because (most of) those 550 changes happen while the users are still “asleep”, users don’t actually notice them. Google doesn’t exactly go out of its way to make many of its search improvements visible to the user, and so it’s often difficult to tell whether or not something has happened. As a user, I personally don’t like that approach, because a change that is invisible or purposely hidden is a change that I as a user have no control over, and am not able to change back or alter further. And as I argued in an earlier post, the way to create passionate search users is not to give them luxury seats without waking them up. Instead, the way to create passionate search users is to give them search tools that offer a path along which they can grow, improve, and get better at searching. Do users get better at flying, or at seeing and comprehending an information landscape from 30,000 feet, if they’ve got luxury chairs? Arguably not. If anything, the luxury chairs make it harder for users to sit upright, to have a “leaning forward”, engaged experience. Users are less inclined, pun intended, to be active participants in the experience. All the decisions are being made for them.

On the surface, it would appear that users now have such a tool, a way to explore, compare, and learn. A way to lean forward at the edge of their seats (rather than lean back, asleep, in luxury chairs) and make search work for them, rather than the other way around. However, the problem is that this tool is still not connected back into an actual search. One can visualize trends, but one cannot actually find books that best exemplify these trends.

Take for example the Ngram Viewer query [science, religion]. Continue reading →

Posted in Exploratory Search | 2 Comments

Search Algorithms versus Asimov’s First Law of Robotics

Posted on December 16, 2010 by jeremy

Search Engine Land has a short article on bias versus brands. The issue at hand is whether Google Instant has a brand bias. Google says it does not:

Singhal explains that when someone types in T, mathematically “most people typing T will go to Target. That’s the probability model. If you add R to it (“Tr”), most people are looking for a translation system. It’s actually just pure mathematical modeling.” It is just math, he says, not a bias.

Oh come on, now! What kind of explanation is that? There is no such thing as “just math”. There is always a conscious decision to use math in a particular way.
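To make that concrete, here is a minimal sketch of the kind of prefix probability model Singhal describes. The query log, its counts, and the function names `complete` and `top_completion` are all hypothetical, invented for illustration; nothing here reflects how Google Instant is actually implemented.

```python
from collections import Counter

# Hypothetical query log: query -> number of times issued (invented numbers).
QUERY_LOG = Counter({
    "target": 50,
    "translate": 40,
    "twitter": 30,
    "travelocity": 10,
})


def complete(prefix, log=QUERY_LOG):
    """Estimate P(query | prefix) from raw counts of logged queries."""
    matches = {q: n for q, n in log.items() if q.startswith(prefix)}
    total = sum(matches.values())
    # No logged query starts with this prefix: return an empty distribution.
    if total == 0:
        return {}
    return {q: n / total for q, n in matches.items()}


def top_completion(prefix, log=QUERY_LOG):
    """Return the single most probable completion, or None if nothing matches."""
    probs = complete(prefix, log)
    return max(probs, key=probs.get) if probs else None


print(top_completion("t"))   # most frequent logged query starting with "t"
print(top_completion("tr"))  # most frequent logged query starting with "tr"
```

Even in this toy version, the "pure math" rests on choices someone made: which log to count from, whether every query counts equally, and whether to show only the single most frequent completion. Change any of those and the same arithmetic produces a different answer.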

Let’s take as an example the classic information retrieval ranking function: tf * idf. Continue reading →
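For readers who have not seen it, here is a minimal sketch of tf * idf scoring. It assumes raw term counts for tf and log(N / df) for idf, which is only one of many common variants, and it runs over a tiny made-up corpus. The documents and the function name `tf_idf_scores` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for illustration.
DOCS = {
    "d1": "science and religion in history",
    "d2": "science of search",
    "d3": "search engines rank documents",
}


def tf_idf_scores(query, docs=DOCS):
    """Score each document as the sum over query terms of tf * idf."""
    tokenized = {doc_id: text.split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)  # raw term counts in this document
        scores[doc_id] = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in query.split()
            if df[term]  # skip query terms that appear in no document
        )
    return scores


scores = tf_idf_scores("science religion")
for doc_id in sorted(scores, key=scores.get, reverse=True):
    print(doc_id, round(scores[doc_id], 3))
```

Swapping raw counts for log-scaled counts, or choosing a different idf formula, would change the scores. The formula is math, but picking it is a decision.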

Posted in Information Retrieval Foundations | 4 Comments

The Search User Wants a Story

Posted on June 25, 2010 by jeremy

I fired up reddit this morning and was completely flabbergasted by one of the top posts. The title of the post was “This is Why I Use Google, Not Bing”. And it linked straight to this screenshot (which I reproduce here, in case the target disappears at some point):

This blew my mind, not only that an alphageek would prefer the (Google) interface on the left to the (Bing) interface on the right, but that the redditor alphageek community would so heavily upvote it. The way I see it, this speaks directly to the issues of simplicity as storytelling vs. sparsity that I’ve talked about from time to time. The interface on the left is anything but sparse. In fact, it is extremely busy and filled with images, a tool belt of various verticals (news, video, images), query modification tools such as timelines and recency sorting, and query reformulation tools such as narrowly related searches (top middle) and broadly related searches (lower left).

In short, everything about it is “non-Googly” Continue reading →

Posted in Exploratory Search, Information Retrieval Foundations | 19 Comments

More on Simplicity and the Paradox of Choice

Posted on June 23, 2010 by jeremy

I came across an interesting blog post today, entitled “The Paradox of Choice is Not Robust”. To requote their quote:

Benjamin Scheibehenne, a psychologist at the University of Basel, was thinking along these lines when he decided (with Peter Todd and, later, Rainer Greifeneder) to design a range of experiments to figure out when choice demotivates, and when it does not.

But a curious thing happened almost immediately. They began by trying to replicate some classic experiments – such as the jam study, and a similar one with luxury chocolates. They couldn’t find any sign of the “choice is bad” effect. Neither the original Lepper-Iyengar experiments nor the new study appears to be at fault: the results are just different and we don’t know why.

After designing 10 different experiments in which participants were asked to make a choice, and finding very little evidence that variety caused any problems, Scheibehenne and his colleagues tried to assemble all the studies, published and unpublished, of the effect.

The average of all these studies suggests that offering lots of extra choices seems to make no important difference either way.

I’ll let that speak for itself, and will note only a few of my related blog posts from a year+ ago: Google Search Options and the Paradox of Choice and Ranked Lists and the Paradox of Choice.

Posted in General, Information Retrieval Foundations | 4 Comments
