Comments on: Is Good Enough Good Enough?

By: jeremy

jeremy — Mon, 31 Aug 2009 20:14:37 +0000

In reply to Aaron Beppu.

Btw, I really am not a fan of kosmix or mixing in videos or photos with web pages. If I wanted a photo, I would search on flickr. If I want a video, I’ll search youtube. I’m not sure under what conditions I’ve thought ‘I want kind of online content related to x; I don’t particularly care if it’s a photo or a journal paper or a web page or a news article’. Getting back to the ‘variety versus quality’ point above, I’d rather have separate search engines for web pages, photos, videos and news items that do a decent job than something that combines them all in a way it thinks is superior.

Well, then you must currently hate what Google does in its regular results? What do they call it.. Universal Search? Aggregated search? Google mixes results from various types (images, videos, etc.) into a single results list. I agree; I personally like to be able to choose my data type. But from that perspective, Kosmix does a better job than Google, because each of the data types/sources (e.g. Twitter vs. blogs vs images vs video) is clearly labeled and separated. It's not too much of a stretch to imagine an extension to Kosmix in which you can click not a single result, but the category as a whole, and get taken to that vertical.

But remember, Kosmix is more than just a mixture of data types. It also has that whole "Related in the Kosmos" section, which is a bit more exploratory.

By: jeremy

jeremy — Mon, 31 Aug 2009 18:49:50 +0000

In reply to Aaron Beppu.

I’ve bought into the relatively narrow box that IR is packaged in

It’s not a bad box to be in. I’m just trying to drive home the point that it’s not the only box.

From this narrow perspective, it doesn’t look like we’re doing that badly on exploratory searches, or that our performance is that different from factoid searches.

This is where I think I disagree. See in my previous comment the Prague Cafes example information need. How well is a ranked list of documents doing on helping meet that information need? Try it yourself. See if you can find the cafe that I’m talking about. See if you can identify more than 2-3 cafes in Prague that are found in passages or down small, side lanes. I personally think we’re doing a terrible job.

That is, I don’t have any particular disagreement with the results I got back (none are useless, and nothing crucial is missing) and the ordering is also fine.

Another example that I like to use is “digital cameras”. I do a G/Y/B search using that query, and it’s true: All 10 of the top results are pretty relevant/useful. However, unlike you, I don’t have a good sense about whether anything crucial is missing. Indeed, if I’m trying to figure out what cameras to buy, what is important in a digital camera, what the best cameras are for various use cases, what things I should look out for, those top 10 links don’t give me any sense of whether I’ve found everything I need to make that decision. I don’t know what I don’t know, and the results of my search also don’t really help me learn what I don’t know.

But stepping back, and asking whether the framework of a query and a results list is appropriate to the information need, I agree. Factoids/lookup fits much better in the current model than exploratory search.

Yes, that was one of my main points.

However, I think the kind of exploratory search you’re talking about, with both exploration of the documents and exploration of the search process is a much harder problem.

I agree; absolutely. But aren’t we up to the challenge? Just because it’s hard, does that mean we will continue to shy away from the problem, the same way we’ve been shying away from the problem on the web for the past 10 years? I’m a researcher; I have to believe that we can do more than we’re currently willing to do.

I don’t know that trying to do that now would actually yield better results.

I think one of the things that we have to do is figure out what “better results” means. Currently, search engines are evaluated using NDCG, i.e. graded precision of a single ranked list, up to the top 40 returned results. Is this even the appropriate metric to measure exploratory search? When your goal is learning, relationship discovery, summarization, etc., does the NDCG objective function accurately measure what you want it to measure? I’m going to go out on a limb and say that it does not. And as a researcher from MITRE once told me, “evaluation drives innovation”. What you choose to measure affects, in large part, the resulting system that you end up creating. If you can’t (or don’t or won’t) measure something, you’ll never build the right system to meet a particular type of need. Therefore, what “better results” means is something that we need to talk about, instead of just using existing metrics.
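To make concrete what NDCG does and does not look at, here is a minimal sketch. It assumes one common formulation (linear gain, log2 rank discount); the function names and the example relevance grades are made up for illustration, and real evaluation setups may use a different gain or cutoff.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k graded relevance scores."""
    # Rank 1 is divided by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k):
    """DCG of the list as ranked, divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded judgments (0 = not relevant, 3 = highly relevant),
# listed in the order the engine returned the documents.
ranked_gains = [3, 2, 3, 0, 1, 2]
score = ndcg_at_k(ranked_gains, k=6)
```

The only inputs are per-document grades and their positions in one list. Nothing in the computation sees whether the set as a whole covers the topic, or what the searcher learned along the way.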

But all of these depend on pretty clean and often structured data.

Does the Modista example really depend on clean and structured data? Or does it do smartly-algorithmic content-based shape / color histogram / etc. feature extraction? I would hope the latter. In fact, I think there is a lot of room left for approaches that use smarter algorithms to drive different types of similarity than we currently optimize for, rather than relying on more data to improve an existing similarity algorithm.

Well, I’ve talked to some senior folks at (for example) Google, and have been told that the approach they take is to release early and often, and constantly iterate. Get the product out there, get it into the hands of users, and then constantly improve. And yet, despite that being their declared approach, I agree that I haven’t actually seen them do that much exploratory search, at all. They’re doing exactly as you say.. acting quite conservatively and sticking by the narrowest of models. Where’s the innovation in that? If you’re not actually releasing anything, how can you collect all the data to improve it? Catch-22, right? As long as it’s made clear to the user that this is an experimental system, i.e. give it that perpetual BETA tag that they use, what’s wrong with a little bit of garbage? The whole point of exploration is that you want to learn something. In that context, nothing is ever truly, completely garbage.

Getting back to the beginning, I don’t know that we can in short reach provide drastically better quality from a small boutique effort for a small audience.

We won’t know, though, until we try. So where’s the try?

I would expect that we can make gains at targeting particular kinds of search — and the place where I search for the most authoritative thing and the place where I search for the closest matching thing can be different.

I know they can be different, but why would they be different? If a search engine’s mission is to organize all the world’s information, then that should also include all the world’s information needs, both factoid/known-item as well as exploratory, right? Why do you think that they (should?) be different?

By: jeremy

jeremy — Sun, 30 Aug 2009 23:19:29 +0000

Aaron — I owe you (and will give you) a longer, more thoughtful response. I’ve thoroughly enjoyed reading what you’ve written. I’ll probably get to that response tomorrow.

Until then, though, let me just give you one example where I think current search engines fail, both in algorithm as well as interface. I wrote about this in April:

http://irgupf.com/2009/04/23/retrievability-and-prague-cafes/

The information need I describe in that post is not factoid, but it is also not completely exploratory. It’s more of a multi-factoid. However, that by itself pushes it more in the exploratory direction. It’s a question to which there isn’t really one right answer; it’s a question to which the whole set is the right answer. It is, as you say:

“If we want to understand documents well enough to pick a subset of them that provide comprehensive coverage of the topic is difficult. And then what you’re talking about is no longer just a search engine — it’s a summarization/composition/organization engine.”

Maybe we really would no longer call it a search engine. Instead we should call it an information retrieval engine. Because information retrieval, rather than the narrow subset of information retrieval that webbies call “search”, does include the summarization, composition and organization subtasks. Organize the world’s information, right? Not just find home pages. But really organize.

Anyway, this is a short response, and I will be writing more tomorrow. I just wanted to quickly point you to that other blog post, as an example of one of the things that “search” engines should be able to do, but do not.

By: Aaron Beppu

Aaron Beppu — Sat, 29 Aug 2009 02:08:56 +0000

Btw, I really am not a fan of kosmix or mixing in videos or photos with web pages. If I wanted a photo, I would search on flickr. If I want a video, I’ll search youtube. I’m not sure under what conditions I’ve thought ‘I want kind of online content related to x; I don’t particularly care if it’s a photo or a journal paper or a web page or a news article’. Getting back to the ‘variety versus quality’ point above, I’d rather have separate search engines for web pages, photos, videos and news items that do a decent job than something that combines them all in a way it thinks is superior.

By: Aaron Beppu

Aaron Beppu — Sat, 29 Aug 2009 02:08:40 +0000

Ok, so I think a large part of my initial problem with the “we’re bad at exploratory searches and good at factoids” claim may be that I’ve bought into the relatively narrow box that IR is packaged in: the user gives a query and the search engine gives back a list of results, and the quality of the response is based on the ‘relevance’ of items on that list, and the degree to which their ordering on the list reflects that notion of relevance. From this narrow perspective, it doesn’t look like we’re doing that badly on exploratory searches, or that our performance is that different from factoid searches. That is, I don’t have any particular disagreement with the results I got back (none are useless, and nothing crucial is missing) and the ordering is also fine. But stepping back, and asking whether the framework of a query and a results list is appropriate to the information need, I agree. Factoids/lookup fits much better in the current model than exploratory search. And saying that we can do an ok job on a much simpler problem than the user actually has is pretty weak.

However, I think the kind of exploratory search you’re talking about, with both exploration of the documents and exploration of the search process is a much harder problem. Setting aside the issues of the user tweaking the search process, which I would have to think more about, just thinking about a richer exploration of the content once you’ve retrieved it is a really hard problem. I think if we stay safely behind the line of returning document lists, we can do ok. But if we want to pull out pieces of information from documents and assemble them, that’s pretty tricky. If we want to understand documents well enough to pick a subset of them that provides comprehensive coverage of the topic, that is difficult. And then what you’re talking about is no longer just a search engine; it’s a summarization/composition/organization engine. I would certainly like to see it, but I don’t know that trying to do that now would actually yield better results. I think product search is moving in the direction of what you want. Modista, for instance, allows you to search for products on a ‘looks like this’ example basis, and then organizes results in a way that’s not a straight ordering, but rather organizing along axes of shape and color. And lots of other places have ‘similar products/movies/artists’ information. This isn’t as rich as you want, I don’t think, but it’s moving in that direction. But all of these depend on pretty clean and often structured data. But web search doesn’t have access to that kind of information, and I never want a page that’s really close to mine anyways; I want something different but related.

So if it’s true that we can’t really do exploratory search yet — and I think our NLP is still weak enough and our pages are still messy enough that this is true — then it makes sense that people offering search will still subscribe to the narrow model, because though the results may not be satisfying in the way that we want, at least it’s not the frustration of a garbage response from a fancy NLP attempt to write a new document from bits of several relevant ones, for instance. Getting back to the beginning, I don’t know that we can in short reach provide drastically better quality from a small boutique effort for a small audience. But I would expect that we can make gains at targeting particular kinds of search — and the place where I search for the most authoritative thing and the place where I search for the closest matching thing can be different.

By: jeremy

jeremy — Thu, 27 Aug 2009 06:33:13 +0000

If I search for ‘information retrieval’ maybe the search engine could take a bunch of highly relevant documents and distinguish them not by some ordering but segmenting by the kind of document they are, e.g. ‘read this encyclopedia article for a quick intro, for a more technical introduction see lecture notes from any of these twelve university courses, this conference proceedings will give you a picture of current research, if you want to read a book on it any of these three are good …’. Or are you thinking that the interface is too restrictive?

No, I think what you’re proposing is a very nice first step along the exploratory path. As a matter of fact, see this dynamically constructed page:

http://www.kosmix.com/search/information_retrieval

Now, if there were only a way to make this process interactive, as well. Refinement. Refactoring. Comparisons and contrasts.

For instance, I still want a better way of seeing how groups of documents interrelate, so I can easily go from my x to a view of a representative sampling of pages linking to x and with snippets surrounding the link — and then rather than issuing a query with keywords, the typical query could be a page.

I agree; I also want to see how documents and groups of documents interrelate. How concepts and phrases interrelate. These sorts of things are indeed more exploratory, less like fast food.

But a Google ranking doesn’t give you any way of seeing how things interrelate. It just gives you a Happy Meal ranking, and hopes that the top half dozen items in the box are enough to fill you up, that you have no need to learn, explore, compare, etc. beyond that.

What do you think; am I really so out on a limb by thinking this way? It certainly isn’t the most popular way to think about IR.

By: jeremy

jeremy — Thu, 27 Aug 2009 06:23:33 +0000

If I do a google search for ‘information retrieval’, I get wikipedia, then two books, a journal, another book, a google directory, a course web page — all potentially very fertile sources of information. I think this was certainly an exploratory query — there’s no single factoid that would work for me, and it’s not page lookup or navigational as I probably want a variety of pages.

Aaron,

I think Gary Marchionini does a much better job at explaining exploratory search than I ever could. Give these few pages a read:

http://www.ischool.utexas.edu/~i385t-sw/readings/Marchionini-2006-Exploratory_Search.pdf

And let me quote:

Searching to learn is increasingly viable as more primary materials go online. Learning searches involve multiple iterations and return sets of objects that require cognitive processing and interpretation. These objects may be instantiated in various media (graphs, or maps, texts, videos) and often require the information seeker to spend time scanning/viewing, comparing, and making qualitative judgments. Note that “learning” here is used in its general sense of developing new knowledge and thus includes self-directed life-long learning and professional learning as well as the usual directed learning in schools. Using terminology from Bloom’s taxonomy of educational objectives, searches that support learning aim to achieve: knowledge acquisition, comprehension of concepts or skills, interpretation of ideas, and comparisons or aggregations of data and concepts.

Your search for “information retrieval” does pull up a couple of good sources from which to start your exploratory search (learning) process. But how comprehensive are those sources? Does the search engine help you get a good overview of the field? Can you easily use the search engine to compare and contrast the main descriptive keywords used in published information retrieval documents from 1990-2000 vs. 2001-2009? Does the search engine reveal to you the myriad of information retrieval forms: image retrieval, video retrieval, music retrieval, expert finding (people retrieval), and 20 other subfields? Is the search engine interactive, so that you can say which of the results you wanted and didn’t want, and get a dynamic re-ranking of the rest of the list, to discover other aspects of the topic that you might not have found, otherwise? Can you do a “sort by least recent” modification of your query, and see what the earliest works were in the field?
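As a rough illustration of the "say which results you wanted and didn't want, and get a dynamic re-ranking of the rest" idea, here is a small sketch. It uses a Rocchio-style relevance feedback update over plain bag-of-words vectors, which is my substitution for illustration, not something any particular engine is known to do. The function names, weights (`alpha`, `beta`, `gamma`) and the toy documents are all assumptions.

```python
import math
from collections import Counter

def vectorize(text):
    """Very crude bag-of-words vector: lowercase, split on whitespace."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def rerank(query, docs, liked=(), disliked=(), alpha=1.0, beta=0.75, gamma=0.25):
    """Re-rank the unjudged docs after the user marks some as wanted or unwanted."""
    # Start from the query, pull the profile toward liked docs, push it away from disliked ones.
    profile = {t: alpha * w for t, w in vectorize(query).items()}
    updates = [(i, beta / len(liked)) for i in liked]
    updates += [(i, -gamma / len(disliked)) for i in disliked]
    for idx, weight in updates:
        for term, w in vectorize(docs[idx]).items():
            profile[term] = profile.get(term, 0.0) + weight * w
    judged = set(liked) | set(disliked)
    rest = [i for i in range(len(docs)) if i not in judged]
    return sorted(rest, key=lambda i: cosine(profile, vectorize(docs[i])), reverse=True)

# Toy collection, purely hypothetical.
docs = [
    "image retrieval color histograms",
    "music retrieval audio features",
    "music retrieval melody",
    "image retrieval shape color features",
]
before = rerank("retrieval", docs)                          # no feedback yet
after = rerank("retrieval", docs, liked=[0], disliked=[1])  # wanted doc 0, not doc 1
```

With no feedback, the short music document outranks the longer image document. After marking one image result as wanted and one music result as unwanted, the remaining image document moves ahead. The point is only that the ranking responds to the searcher, not that this particular weighting is the right one.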

To clarify about “fast food”, I use this term to refer to the fact that most major search engines basically offer a set menu, a Happy Meal #3. That set menu is the relevance ranking the way *they* think is most relevant. And there are usually very few ways you can modify that ranking, very few ways you can interact with the engine to produce a different sort of outcome. The ranking is optimized toward the mass market needs, and mass market tastes. (Let me also credit Daniel T. for frequent use of this analogy. See http://thenoisychannel.com/2009/02/05/the-banality-of-crowds/ and http://thenoisychannel.com/2009/01/08/google-tech-talk-reconsidering-relevance/ )

That’s not what search is supposed to be. Search is supposed to be flexible. User controllable. It is supposed to respect my definition of relevance, not the masses’ definition. As Marchionini writes, it should enable me to do things like accretion, analysis, exclusion/negation, synthesis, evaluation, discovery, planning/forecasting, and transformation. From within the search engine. Not manually, post facto.

By: Aaron Beppu

Aaron Beppu — Thu, 27 Aug 2009 02:48:17 +0000

Can you expand on your distinction between exploratory versus factoid versus recall-oriented information needs, and how you think we’re doing on these? If I do a google search for ‘information retrieval’, I get wikipedia, then two books, a journal, another book, a google directory, a course web page — all potentially very fertile sources of information. I think this was certainly an exploratory query — there’s no single factoid that would work for me, and it’s not page lookup or navigational as I probably want a variety of pages. I don’t know that we’re actually doing worse here than for factoids — certainly there are a lot of factoids that probably aren’t central to any particular document, and I think we’re a ways off from doing a solid job of automatically combining information from multiple documents in a meaningful way. I understand that it’s harder to measure performance for exploratory queries than for navigational ones; a query for ‘nate silver’ better have fivethirtyeight.com at the top of the results list, but broad exploratory queries may have a large number of mostly comparable highly relevant documents.

I can maybe imagine that the narrow framework of receiving a query of a small number of words and giving back an ordered list of documents is better suited for factoid or navigational queries than for exploratory searches. If I search for ‘information retrieval’ maybe the search engine could take a bunch of highly relevant documents and distinguish them not by some ordering but segmenting by the kind of document they are, e.g. ‘read this encyclopedia article for a quick intro, for a more technical introduction see lecture notes from any of these twelve university courses, this conference proceedings will give you a picture of current research, if you want to read a book on it any of these three are good …’. Or are you thinking that the interface is too restrictive? For instance, I still want a better way of seeing how groups of documents interrelate, so I can easily go from my x to a view of a representative sampling of pages linking to x and with snippets surrounding the link, and then rather than issuing a query with keywords, the typical query could be a page.
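A minimal sketch of the "segment by kind of document" idea, assuming each result already carries a relevance score and a document-type label. That labeling is the hard part and is simply taken as given here; the field names, the type labels and the example results are hypothetical.

```python
from collections import defaultdict

def group_by_kind(results, kind_order):
    """Split scored results into labeled groups instead of one ranked list."""
    groups = defaultdict(list)
    # Within each group, keep the usual best-first ordering.
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        groups[r["kind"]].append(r["title"])
    # Present groups in a reading order chosen by the interface, not by score.
    ordered = [(kind, groups[kind]) for kind in kind_order if kind in groups]
    ordered += [(kind, titles) for kind, titles in groups.items() if kind not in kind_order]
    return ordered

# Hypothetical results for the query 'information retrieval'.
example_results = [
    {"title": "Lecture notes A", "kind": "lecture notes", "score": 0.81},
    {"title": "Encyclopedia article", "kind": "encyclopedia", "score": 0.93},
    {"title": "Textbook B", "kind": "book", "score": 0.88},
    {"title": "Lecture notes C", "kind": "lecture notes", "score": 0.90},
]
sections = group_by_kind(
    example_results,
    kind_order=["encyclopedia", "lecture notes", "proceedings", "book"],
)
```

Each entry in `sections` pairs a document kind with its titles, so the page can say "quick intro here, course notes there, books over there" rather than interleaving everything by score.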

And just to clarify about ‘fast-food’ — do you use this term to refer to using the search engine to go somewhere else quickly, such that the search engine is only a quick stop over? Asked another way, in your vision of an improved search engine, would we spend more time looking at the results list, or some combination of information from multiple sources on the engine’s page rather than clicking off to inspect one of these documents one at a time? Or does fast food merely refer to a low bar with regards to quality?

By: jeremy

jeremy — Wed, 26 Aug 2009 06:24:59 +0000

Imho. 🙂

By: jeremy

jeremy — Wed, 26 Aug 2009 06:24:12 +0000

But I think the distinction for IR and search isn’t convenience versus quality.

If your search engine’s retrieval quality is optimized toward “convenience”-type information needs, then I agree, there isn’t a distinction. (By convenience-type needs, I mean home page finding, product lookup, fact lookup, movie-start-time retrieval, weather retrieval, etc.)

If, on the other hand, you have an exploratory information need, or have a recall-oriented information need, then the quality is pretty low at the moment. That’s because the search engine has been optimized toward quick-answer, factoid, fast-food convenience retrieval.

This is more than just one-size-fits-all vs. vertical search. I see where you’re going with that, but I’m not talking about any one vertical. I’m talking about general web information, but with a more exploratory or recall-oriented bent. Perhaps it is technically easier to go deeper in any one vertical. I think that’s the intuition that you have, and I don’t disagree with it. I’m just sayin’ that’s not the full story. A recall-oriented need doesn’t *have* to be vertically aligned, only. There are many that are general.

I agree with you about variety preservation. That’s an excellent point. But again, I would not align variety with verticals. Or document types. Or domains. I think you’re getting closer with your authority vs. raw text match idea. I would rather align variety with broader information seeking behaviors: exploration, learning, recall, etc.

Right now there is no variety on the web, in that everything is oriented toward high-precision factoid retrieval. No matter what vertical, no matter what document type. Informationphiles love exploration, learning, etc.