Explanatory Search – Information Retrieval Gupf

Seeing Stars

jeremy — Wed, 28 Apr 2010 20:59:52 +0000

There is an interesting blogpost on the Official Google blog today, about seeing stars:

We’ve long believed that personalization makes search more relevant and fun. For nearly five years, we’ve been tailoring results with personalized search. Today we’re announcing a new feature in search that makes it easier for you to mark and rediscover your favorite web content — stars. With stars, you can simply click the star marker on any search result or map and the next time you perform a search, that item will appear in a special list right at the top of your results when relevant. That means if you star the official websites for your favorite football teams, you might see those results right at the top of your next search for [nfl].

So it sounds to me like this is a sort of bookmarking. What is not as obvious, however, is what this sentence means: “the next time you perform a search, that item will appear in a special list right at the top of your results when relevant”. Does that mean the next time you perform the same search (e.g. [nfl]) that starred item will appear at the top? Or is it more dynamic than that? I.e., if I happen to perform the search [new england patriots], and that same link that I’d previously starred after executing the [nfl] query happens to be ranked in the top k, will it again appear at the top of my list? (And if so, what is the cutoff/threshold for k?) Similarly, if Google’s ranking of my original [nfl] query changes, due to shifting PageRank calculations, changes in freshness, or any of the hundreds++ of other signals that go into the ranking algorithm, and my particular starred web page no longer appears in the top k because it is no longer relevant to the [nfl] query using the signal vector from the current state of the index, will the starred item not appear? After all, Google says that the starred item will only appear if it is relevant, and if it is no longer relevant to the [nfl] query, as determined by Google’s relevance algorithm, then it won’t appear? Even though I had previously starred it with respect to that exact query?
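The two readings of “when relevant” can be made concrete with a minimal Python sketch. Nothing here reflects how Google actually implements stars: the toy ranker, the function names `stars_per_query` and `stars_top_k`, and the cutoff `k` are all hypothetical, chosen only to show how differently the two interpretations behave.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for the engine: maps a query to its current ranked URLs.
Ranker = Callable[[str], List[str]]


def toy_rank(query: str) -> List[str]:
    """Toy ranker with made-up results, for illustration only."""
    index = {
        "nfl": ["nfl.com", "patriots.com", "espn.com"],
        "new england patriots": ["patriots.com", "boston.com", "nfl.com"],
    }
    return index.get(query, [])


def stars_per_query(
    query: str, stars_by_query: Dict[str, Set[str]], rank: Ranker
) -> Tuple[List[str], List[str]]:
    """Reading 1: a star is remembered only for the exact query it was made under."""
    starred = sorted(stars_by_query.get(query, set()))
    return starred, rank(query)


def stars_top_k(
    query: str, starred: Set[str], rank: Ranker, k: int = 10
) -> Tuple[List[str], List[str]]:
    """Reading 2: a starred URL surfaces for any query whose current top k contains it."""
    organic = rank(query)
    surfaced = [url for url in organic[:k] if url in starred]
    return surfaced, organic


# Both functions return (starred list, untouched organic list) as separate channels.
print(stars_per_query("new england patriots", {"nfl": {"patriots.com"}}, toy_rank)[0])
print(stars_top_k("new england patriots", {"patriots.com"}, toy_rank, k=2)[0])
print(stars_top_k("nfl", {"patriots.com"}, toy_rank, k=1)[0])
```

Under the first reading the star never follows me to a new query. Under the second it does, but it can also vanish from the very query I starred it under once it drops out of the top k, which is exactly the ambiguity in question.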

The post continues:

In our testing, we learned that people really liked the idea of marking a website for future reference, but they didn’t like changing the order of Google’s organic search results. With stars, we’ve created a lightweight and flexible way for people to mark and rediscover web content.

Now I am thoroughly confused. People didn’t like changing the order of Google’s organic search results, but at the same time, Google claims earlier in the post that “For nearly five years, we’ve been tailoring results with personalized search.” What does it mean to personalize search results, if not to change the order of Google’s organic search results? Quoting the earlier post:

With the launch of Personalized Search, you can use that search history you’ve been building to get better results. You probably won’t notice much difference at first, but as your search history grows, your personalized results will gradually improve.

So if users didn’t like changing the order of the organic search results, does this mean that Google has turned off (or will be turning off) personalization completely for all signed-in users? Or does personalization co-exist with explicit starring/bookmarks? If so, how exactly does that work? Will Google change the order (personalize) your organic results using only the signals of query history and implicit relevance (i.e. clickthrough), but not the signal of explicit starring? That’s even more confusing…the amount of mental jazz involved is a bit overwhelming. Sure, the interface jazz is kept to a minimum, but at the expense of making the user’s mental model of what the search engine is actually doing for him or her even more muddled.

Perhaps the best way to sort out this confusion is to dive in headfirst and start playing around with the system, seeing what it actually does and when. But I personally have a difficult time generating the gumption to use a feature for which I have an unclear mental model: an unclear understanding of what it is trying to do for me, how it might change, and when it might or might not magically appear. Especially when some of my actions affect the state of the system and others do not.

One thing I do like about this feature, however, is that it uses out-of-band displays to show different types of information. Rather than trying to mix global/non-personalized results, implicit personalized results, and starred results, it lets you know via a separate channel whether there is any information that you have previously starred. This is an IR design principle that I would like to see more of — separate goals in separate channels. Examples of different IR goals include navigation, re-finding, discovery, exploration, etc. Rather than trying to mix results from all of these goals into a single channel (a single ranked list) it is quite useful to separate each goal from the other. This new Google interface does that. What exactly the goal attached to that separate channel is, again, unclear. But the existence of a separate channel is an interesting and exciting approach, one that I hope to see more of.

Kasparov and Good Interaction Design

jeremy — Mon, 25 Jan 2010 22:59:14 +0000

A NYT books article about Kasparov and chess, and the relationship between humans, machines, and decision processes is making the Twitter rounds today. I don’t have time at the moment to write a long comment about it, but I do want to point out that it supports a position that I’ve been taking on this blog for some time now:

This experiment goes unmentioned by Rasskin-Gutman, a major omission since it relates so closely to his subject. Even more notable was how the advanced chess experiment continued. In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. Normally, “anti-cheating” algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less “intelligent” than the playing programs they detect.)

Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.

The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

This result seems awfully similar to some of the other results I’ve reported on in the past. For example, see this paper by Amatriain:

Data is always important, but what struck me in the writeup was his discovery that the biggest advances came not from accumulation of massive amount of data, log files, clicks, etc. Rather, while dozens and dozens of researchers around the world were struggling to reach that coveted 10% improvement by eking out every last drop of value from large data-only methods, Amatriain comparatively easily blew past that ceiling and hit 14%. How? He simply asked users to denoise their existing data by rerating a few items. In short, Amatriain resorted to HCIR:

See also Tessa Lau’s post about how good interaction design trumps smart algorithms:

I come to the field of HCI via a background in AI, having learned the hard way that good interaction design trumps smart algorithms in the quest to deploy software that has an impact on millions of users. Currently a researcher at IBM’s Almaden Research Center, I lead a team that is exploring new ways of capturing and sharing knowledge about how people interact with the web. We conduct HCI research in designing and developing new interaction paradigms for end-user programming.

See also two of my previous posts, More and Faster versus Smarter and More Effective and A Bird in the Hand.

The theme that I see is that, while big data approaches do work well, what works even better is a small amount of user interaction. With big data methods (even ones that incorporate human interaction in the form of massive log data) all you can do is make inferences about what is good and what is not good. The more historical user data you have, the more correct your inference about the current scenario is likely to be. But none of it is as correct as receiving explicit feedback from the user, and turning a probability into a certainty.

And that’s where I see good interaction design coming into play. By turning a probability into a certainty, your back end algorithms can stop wasting their CPU cycles doing all the inferential heavy lifting about what the user is actually trying to say or do, and can start using their CPU cycles to explore a wider range of consequences of that informational certainty.
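As a rough illustration of “turning a probability into a certainty”, here is a minimal Python sketch. The intent labels, the click counts, and the function names are all made up for the example; the point is only that log data yields a distribution over what the user might mean, while one explicit answer collapses that distribution.

```python
import math
from typing import Dict


def infer_from_logs(click_counts: Dict[str, int]) -> Dict[str, float]:
    """Estimate intent probabilities from (hypothetical) historical click counts."""
    total = sum(click_counts.values())
    return {intent: n / total for intent, n in click_counts.items()}


def apply_explicit_feedback(dist: Dict[str, float], chosen: str) -> Dict[str, float]:
    """The user states the intent directly: all probability mass moves to it."""
    if chosen not in dist:
        raise KeyError(f"unknown intent: {chosen}")
    return {intent: 1.0 if intent == chosen else 0.0 for intent in dist}


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy in bits: how much uncertainty about the intent remains."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Made-up log data for the ambiguous query [jaguar].
inferred = infer_from_logs({"car": 60, "animal": 30, "operating system": 10})
certain = apply_explicit_feedback(inferred, "animal")

print(f"uncertainty from logs alone: {entropy(inferred):.2f} bits")
print(f"uncertainty after one click: {entropy(certain):.2f} bits")
```

More log data can sharpen the inferred distribution, but in this toy setup only the explicit answer drives the remaining uncertainty to zero, and it does so for the minority intent that the logs would have ranked second.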

More Information Is Positive

jeremy — Mon, 09 Nov 2009 12:48:48 +0000

Via Greg Linden, I came across this interesting quote from Eric Schmidt about the obligation to help newspapers succeed:

Finally, Eric claimed Google has a moral duty to help newspapers succeed:

Google sees itself as trying to make the world a better place. And our values are that more information is positive — transparency. And the historic role of the press was to provide transparency, from Watergate on and so forth. So we really do have a moral responsibility to help solve this problem.

Well-funded, targeted professionally managed investigative journalism is a necessary precondition in my view to a functioning democracy … That’s what we worry about … There [must be] enough revenue that … the newspaper [can] fulfill its mission.

It is great that Google feels this professional responsibility. And I wholeheartedly agree with Schmidt that “more information is positive”. My only question is: Why don’t we see “more information” and transparency when it comes to other media companies, aka search engines? Newspapers engage in investigative journalism in order to bring stories from industry and politics to the citizens. Search engines engage in algorithmic retrieval in order to bring stories from the newspapers (and other sources) to the citizens. The historical role of the press has been to provide transparency. So, too, is it the modern role of the retrieval engine to provide transparency. And just as a good reporter has to cite sources to make their stories credible, so should a search engine provide explanatory interfaces, algorithms, and information to make its results credible.

Shouldn’t there be an expectation of as much information and transparency from our search interfaces and algorithms as we have from our press? It is no secret that I think there should be. It is a goal that I strive for in my own research; I can’t say that it’s not difficult, but it is worth striving for.

The Craft of Storytelling

jeremy — Thu, 05 Nov 2009 11:50:39 +0000

I’ve been playing around with some old TREC data over the past few days and completely by chance I came across this document. I find it interesting because storytelling is a good metaphor for what we as researchers do when we construct interactive information seeking systems. The document is short enough that I think I can reproduce it here in its entirety without getting into intellectual property trouble. I hope.

DOCNO: LA070590-0123

DOCID: 243123

July 5, 1990, Thursday, Home Edition

Calendar; Part F; Page 1; Column 1; Calendar Desk

57 words

QUOTABLE

“Networks are run by people whose weakest suit is that they can’t understand the importance of the craft of storytelling, which is what film and television are all about. . . . They can do statistical things, but they can’t quantify storytelling and put it into a computer.”

Writer-producer Roy Huggins, in Television & Families magazine

Wikipedia’s take on Roy Huggins.

More and Faster versus Smarter and More Effective

jeremy — Thu, 30 Apr 2009 11:13:14 +0000

Last month, in reaction to the “Unreasonable Effectiveness of Data” paper that made the rounds, Stephen Few from the Business Intelligence community wrote an interesting post:

The notion that “we need more data” seems to have always served as a fundamental assumption and driver of the data warehousing and business intelligence industries. It is true that a missing piece of information can at times make the difference between a good or bad decision, but there is another truth that we must take more seriously today: most poor decisions are caused by lack of understanding, not lack of data. The way that data warehousing and business intelligence resources are typically allocated fails to reflect this fact. The more and faster emphasis of these efforts must shift to smarter and more effective. Although current efforts to build bigger and faster data repositories and better production reporting systems should continue, they should take a back seat to efforts to increase the data sense-making skills of workers and to improve the tools that support these skills.

This is a point that I wholly subscribe to, and an aspect of which I encountered the other day when attempting to use web search engines to satisfy my “hidden cafes in prague” information need. It didn’t matter that big data pointed the way to all the popular cafes. And it didn’t matter that the search engine came back with results to each of my [hidden prague cafe] and [prague passage cafe] queries in a blazingly fast 0.7 seconds. The answers weren’t correct. I spent orders of magnitude more time and effort (20 minutes, in fact) trying to come up with the right way of instructing the search engine as to my true information need in the first place. In the end I never did find the right query to help me find the U Raka cafe, short of using the name of the cafe itself, which was the whole point.

So I agree; what is needed is not more data and faster answers, but better tools to help us comb and make sense of that data, and ask the right questions in the first place. A one-line text input box is not enough.

See also my commentary on Improving Findability with respect to Government 2.0 Search. The same issue exists there, as well. To rephrase: Although current efforts to build bigger and faster data [Government data] repositories…should continue, they should take a back seat to efforts to increase the data sense-making skills of [citizens] and to improve the [search engine] tools that support these skills.

Few concludes (emphasis mine):

Researchers, especially those who work in the cognitive sciences, have learned a great deal about the way people process information and make decisions, including the flaws in the process that often trip us up. Proper training based on these insights is needed to make us better analysts; good tools are needed to help us work around analytical limitations that are built right into our brains. It is toward these ends that the bulk of our data warehousing and business intelligence investments should be directed. Is this where you’re focusing your efforts? Is this even on your radar?

Music Explaura: Exploration and Discovery in Action

jeremy — Tue, 07 Apr 2009 20:27:13 +0000

Music Information Retrieval continues to be an excellent place to play around with the intersection of search, recommendation, user-guided exploration, and explanatory (transparent) algorithms.

First, check out the announcement of Music Explaura from Stephen Green at Sun Research. Stephen writes:

On the left of the artist page, you see the list of similar artists generated by the AURA recommenders. This list of artists is generated using a technique that’s quite a bit different than you’re probably used to. Rather than relying on the wisdom of the crowds via a technique like collaborative filtering, the AURA system computes the similarity between artists by computing the similarity between their textual auras.

Second, see Paul Lamere’s writeup, “Music Discovery is a Conversation, Not a Dictatorship”. Paul writes:

The Music Explaura gives us a hint of what music discovery will be like in the future. Instead of a world where a music vendor gives you a static list of recommended artists we’ll live in a world where the recommender can tell you why it is recommending an item, and you can respond by steering the recommendations away from things you don’t like and toward the things that you do like. Music discovery will no longer be a dictatorship, it will be a two-way conversation.

This is really cool stuff, and I hope the ideas behind steerable recommendations will start to work their way out into all types of search and recommendation, from Amazon-style purchase recommendations to standard web search itself.
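To give a feel for what explainable, steerable similarity could look like, here is a minimal Python sketch. It is not the AURA implementation, whose details are not given in the quoted posts. It assumes a “textual aura” can be approximated as a bag of weighted terms and compared with cosine similarity; the `Aura` type, the function names, and the artist data are hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical stand-in for a "textual aura": descriptive term -> weight.
Aura = Dict[str, float]


def cosine(a: Aura, b: Aura) -> float:
    """Cosine similarity between two weighted term bags."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def explain(a: Aura, b: Aura, top: int = 3) -> List[str]:
    """The 'why': shared terms that contribute most to the similarity score."""
    shared = {term: a[term] * b[term] for term in a if term in b}
    return sorted(shared, key=lambda term: (-shared[term], term))[:top]


def steer(aura: Aura, adjustments: Dict[str, float]) -> Aura:
    """The 'conversation': scale term weights up (>1) or down (<1; 0 removes)."""
    return {term: w * adjustments.get(term, 1.0) for term, w in aura.items()}


seed = {"indie": 1.0, "lo-fi": 0.8, "punk": 0.2}
artist_a = {"indie": 0.9, "lo-fi": 0.7}
artist_b = {"punk": 1.0, "indie": 0.3}

print(explain(seed, artist_a))                      # terms behind the match
steered = steer(seed, {"punk": 10.0, "lo-fi": 0.0})  # "more punk, no lo-fi"
print(cosine(steered, artist_a) < cosine(steered, artist_b))
```

Because the similarity is computed over human-readable terms, the same structure that ranks the artists can also be shown to the user and edited by them, which a purely collaborative-filtering score does not offer as directly.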

Is the Ad-Sponsored Web Search Market a Conversation?

jeremy — Wed, 01 Apr 2009 13:15:25 +0000

It has now officially been ten years since Christopher Locke, Doc Searls, and David Weinberger wrote the Cluetrain Manifesto, rekindling and reminding us of the centuries-old notion that markets are conversations between people, buyers and sellers. The following are a few of the Manifesto’s points that resonate with me:

The Internet is enabling conversations among human beings that were simply not possible in the era of mass media.

Human communities are based on discourse—on human speech about human concerns. The community of discourse is the market.

Markets want to talk to companies. Sadly, the part of the company a networked market wants to talk to is usually hidden behind a smokescreen of hucksterism, of language that rings false—and often is. Markets do not want to talk to flacks and hucksters. They want to participate in the conversations going on behind the corporate firewall. De-cloaking, getting personal: We are those markets. We want to talk to you. We want access to your corporate information, to your plans and strategies, your best thinking, your genuine knowledge. We will not settle for the 4-color brochure, for web sites chock-a-block with eye candy but lacking any substance. We’re also the workers who make your companies go. We want to talk to customers directly in our own voices, not in platitudes written into a script.

The authors elaborate on this point a little more in their book:

The first markets were filled with people, not abstractions or statistical aggregates; they were the places where supply met demand with a firm handshake. Buyers and sellers looked each other in the eye, met, and connected. The first markets were places for exchange, where people came to buy what others had to sell — and to talk. The first markets were filled with talk. Some of it was about goods and products. Some of it was news, opinion, and gossip. Little of it mattered to everyone; all of it engaged someone. There were often conversations about the work of hands: “Feel this knife. See how it fits your palm.” “The cotton in this shirt, where did it come from?” “Taste this apple. We won’t have them next week. If you like it you should take some today.” Some of these conversations ended in a sale, but don’t let that fool you. The sale was merely the exclamation mark at the end of the sentence. Market leaders were men and women whose hands were worn by the work they did….For thousands of years, we knew exactly what markets were: conversations between people who sought out others who shared the same interests. Buyers had as much to say as sellers. They spoke directly to each other without the filter of media, the artifice of positioning statements, the arrogance of advertising, or the shading of public relations.

The Internet was supposed to herald the dawn of a new era, a return to those days in which buyer and seller could look each other in the eye and, with a firm handshake, directly exchange goods and services for money. How well has the Internet lived up to this promise? In particular, how well has it lived up in the marketplace of Information Seeking? Are information seekers (search engine users) and information providers (search engines) able to engage in the sort of marketplace conversation that the Cluetrain Manifesto authors advocate?

As an individual and as an Information Retrieval researcher, I am frustrated. One thing that strikes me is that there is a complete dearth of companies that allow a direct exchange of information seeking services for cash. No web scale search engines that I know of let me pay them for providing the level and quality of service that I demand. Instead, advertisers sit in a middle layer in between information seekers or buyers and information providers or sellers. Info buyers (users) pay advertisers in attention, advertisers pay info sellers (search engines) in real cash, and info sellers then give the info buyers an information service. The 3-way exchange of attention and money seems to work well, but at the end of the day it is not the kind of marketplace envisioned in the Cluetrain Manifesto. The buyers and sellers are not looking each other directly in the eye and coming to an agreement about the value and quality of the services exchanged. Rather, the discussion is being tempered by the advertiser, and by the seller’s obligations to the advertiser.

On top of that, there continues to be a lack of transparency in the plans and strategies department. Yesterday I commented about how evaluation drives innovation. But beyond a couple of corporate mottos, slogans and mission statements about putting the user first, I have little clue about what evaluation metrics are being used to drive the solutions to my information seeking problems. I have no idea what functions are being optimized, nor how well those functions relate to my explicit information seeking behaviors.

I agree with the Cluetrain Manifesto authors. I want to talk to these information seeking service providers, these search engine companies. I want to participate in the conversations that are going on behind the corporate firewall. And I want to make my needs known, directly, in my own voice and not through the statistically-aggregated voice of millions of people, many of whom are similar to me but many of whom are not. I do not think that we are quite there yet.

What I would really like to know is whether there is anything that I can do as an Information Retrieval researcher to help speed this process along. One of the reasons I am interested in Explanatory Search is that I have an intuition that this style of information retrieval can grease the way for better markets-as-conversations. It is not completely clear whether this will happen. Nevertheless, it is something that I think about as I develop my research questions.

Media Gatekeepers and Transparency

jeremy — Thu, 26 Mar 2009 20:40:22 +0000

PBS has an interesting article on the new media gatekeepers and the need for transparency in the process by which they promote media. Here is an excerpt:

The problem for these new gatekeepers is that they are providing the old editorial functions, but there’s a key difference between the way they operate and the way that movie critics, music reviewers and video store clerks operate: They are making editorial decisions without telling us who they are, what they like and how they are making those decisions. Otherwise, we will be left to wonder, left to come up with our own conspiracy theories, and we will lose trust in these services.

I believe this need for transparency is true not only for Twitter, Apple and YouTube, but for all types of search, including general web search. Search engines need to get better at explaining why results were retrieved, lest users begin losing trust in those engines and/or find themselves ultimately unable to find the information they desire, due to an inability to correctly express their information needs.

Music Retrieval: Algorithms or Explanatory Context?

jeremy — Thu, 26 Mar 2009 12:25:22 +0000

At SXSW this year, Paul Lamere of The Echo Nest and Anthony Volodkin of Hype Machine engaged in a head-to-head panel about the utility of:

Using computer algorithms (e.g. collaborative filtering, tag-based, content-based, etc.) to automatically recommend music, versus
Using computers to (a) connect people who can directly recommend music to each other and (b) provide contextually relevant information around any shared songs

Perhaps I don’t fully understand the full subtlety of the conflict, but I find myself wondering: Why can’t you do both?

I am a strong advocate of content-based recommendation and retrieval methods, i.e. extracting rhythmic structure, harmony, and timbre, and using this information as part of a music retrieval system. This helps you get at aspects of a song not easily describable in any other way. At the same time, I am also a strong advocate of Explanatory Search, and of giving users more information about the retrieved item than just the item itself. It seems to me that if you can combine the strong voice of human explanation with the automated and naturally exploratory ability of content-based methods, you would have an unbeatable combo.

Music is the perfect environment for Exploratory Search, after all. The more ways there are to explore, the better.

I have not yet been able to find an audio or video recording of the session; if anyone comes across it, please let me know. In the meantime, here are Paul’s slides, and here is an open response by Anthony. Slide 56 in Paul’s presentation is particularly humorous, and Anthony makes an interesting point when he says:

The way to drive genuine discoveries without a significant reliance on collaborative filtering or recommendation algorithms is to intelligently select and present information that captures the context of a particular piece of music and creates meaning for the person interacting with the system. Use computer systems to connect people, spotlight individual voices, then have voices and social connections define what music everyone interacts with.