Workshop on Collaborative Information Retrieval (CIR 2011)
CIKM’2011, Glasgow, UK, October 28th.
http://cir2011.fxpal.com/
Organizers
———-
- Gene Golovchinsky, FX Palo Alto Laboratory, Inc, USA.
- Jeremy Pickens, Catalyst Repository Systems, USA.
- Meredith Ringel Morris, Microsoft Research, USA.
- Juan M. Fernández-Luna, University of Granada, Spain.
- Juan F. Huete, University of Granada, Spain.
- Julio C. Rodríguez-Cano, University of Informatics Science, Cuba.
Introduction and Goal
———————
This is the third workshop we are organizing on the topic of collaborative information retrieval. The first workshop, held in conjunction with JCDL 2008, focused on broad topics and sought to establish a vocabulary for discussion about collaborative information seeking, to identify work practices and disciplines that might benefit from collaborative information seeking, and to establish a community of researchers with related interests. The second workshop, held in conjunction with CSCW 2010, built on the previous results, and focused on issues of communication and awareness in support of collaborative information seeking.
Our goal in this third workshop is to focus on algorithmic and other software issues related to information seeking in a collaborative setting. Continue reading…
The web is abuzz this week with talk of the Google Books Ngram Viewer. It’s a great tool, and leads to some very interesting exploration and trend visualization. So does this tool fly in the face of my rant from a few days ago, about how Google’s improvements to search are all automated improvements, with no opportunity for the user to learn and grow?
The first problem is that because (most of) those 550 changes happen while the users are still “asleep”, users don’t actually notice them. Google doesn’t exactly go out of its way to make many of its search improvements visible to the user, and so it’s often difficult to tell whether or not something has happened. As a user, I personally don’t like that approach, because a change that is invisible or purposely hidden is a change that I as a user have no control over, and am not able to change back or alter further. And as I argued in an earlier post, the way to create passionate search users is not to give them luxury seats without waking them up. Instead, the way to create passionate search users is to give them search tools that give users a path along which they can grow, improve, and get better at searching. Do users get better at flying, or at seeing and comprehending an information landscape from 30,000 feet, if they’ve got luxury chairs? Arguably not. If anything, the luxury chairs make it harder for users to sit upright, to have a “leaning forward”, engaged experience. Users are less inclined, pun intended, to be active participants in the experience. All the decisions are being made for them.
On the surface, it would appear that users now have such a tool, a way to explore, compare, and learn. A way to lean forward at the edge of their seats (rather than lean back, asleep, in luxury chairs) and make search work for them, rather than the other way around. However, the problem is that this tool is still not connected back into an actual search. One can visualize trends, but one cannot actually find books that best exemplify these trends.
Take for example the Ngram Viewer query [science, religion]. Continue reading…
I fired up reddit this morning and was completely flabbergasted by one of the top posts. The title of the post was “This is Why I Use Google, Not Bing”. And it linked straight to this screenshot (which I reproduce here, in case the target disappears at some point):

This blew my mind, not only that an alphageek would prefer the (Google) interface on the left to the (Bing) interface on the right, but that the redditor alphageek community would so heavily upvote it. The way I see it, this speaks directly to the issues of simplicity as storytelling vs. sparsity that I’ve talked about from time to time. The interface on the left is anything but sparse. In fact, it is extremely busy and filled with images, a tool belt of various verticals (news, video, images), query modification tools such as timelines and recency sorting, and query reformulation tools such as narrowly related searches (top middle) and broadly related searches (lower left).
In short, everything about it is “non-Googly” Continue reading…
A NYT books article about Kasparov and chess, and the relationship between humans, machines, and decision processes is making the Twitter rounds today. I don’t have time at the moment to write a long comment about it, but I do want to point out that it supports a position that I’ve been taking on this blog for some time now:
This experiment goes unmentioned by Rasskin-Gutman, a major omission since it relates so closely to his subject. Even more notable was how the advanced chess experiment continued. In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. Normally, “anti-cheating” algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less “intelligent” than the playing programs they detect.)
Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.
The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.
This result seems awfully similar to some of the other results I’ve reported on in the past. Continue reading…
The Edge has published their annual question for 2010:
HOW IS THE INTERNET CHANGING THE WAY YOU THINK?
As an Information Retrieval research scientist, I of course was quite interested in what search folks had to say. I found this blurb from Marissa Mayer intriguing:
It’s not what you know, it’s what you can find out. The Internet has put at the forefront resourcefulness and critical-thinking and relegated memorization of rote facts to mental exercise or enjoyment. Because of the abundance of information and this new emphasis on resourcefulness, the Internet creates a sense that anything is knowable or findable — as long as you can construct the right search, find the right tool, or connect to the right people. The Internet empowers better decision-making and a more efficient use of time…
The Web has also enabled amazing dynamic visualizations, where an ideal presentation of information is constructed — a table of comparisons or a data-enhanced map, for example. These visualizations — be it news from around the world displayed on a globe or a sortable table of airfares — can greatly enhance our understanding of the world or our sense of opportunity. We can understand in an instant what would have taken months to create just a few short years ago. Yet, the Internet’s lack of structure means that it is not possible to construct these types of visualizations over any or all data. To achieve true automated, general understanding and visualization, we will need much better machine learning, entity extraction, and semantics capable of operating at vast scale.
It sounds like there is an increased awareness of (and respect for) Exploratory Search. I’ve heard this via private channels, but this is the first time I’ve seen an acknowledgment of the need for more exploratory search from such an official channel.
I do want to point out, however, that in order to make this work at web scale, we won’t just need better automated methods. I.e. we cannot rely solely on machine learning, entity extraction, or web-scale semantics. Rather, what is also desperately needed is a way for the user him- or herself to inject personal semantics and structure into the search, visualization, and comparison process. The search engine itself needs to be responsive to the structure that the user is giving to it, and rearrange itself around that information.
I am afraid that I am not being very clear in the vision that I’m attempting to lay out, so let me draw an analogy to parametric and non-parametric statistical modeling. Continue reading…
Greg Linden has an interesting post on Search on a domain like YouTube. I reproduce it here because I would like to elaborate on it:
The article focuses on YouTube’s “plans to rely more heavily on personalization and ties between users to refine recommendations” and “suggesting videos that users may want to watch based [...]
Chris Dixon has a post yesterday about search and the social graph. An interesting read, but what struck me the most was a tangent about how current search engines make money:
Lost amid this discussion, however, is that the links people tend to share on social networks – news, blog posts, videos – are [...]
Daniel T. has an interesting bipartite use-case model for exploratory search:
I know what I want, but I don’t know how to describe it. I don’t know what I want, but I hope to figure it out once I see what’s out there.
Perhaps this is a silly analogy, but framing the problem in [...]
TechCrunch is reporting a new Google Music service, purportedly to be released in about a week here in the U.S.:
Matt Ghering, a product marketing manager at Google, has been one of the people talking to the big four music labels about the new service, we’ve heard from one of our sources. And he [...]
What sort of information retrieval system would you build if you knew that all the users of your system would be expert or highly-motivated amateur searchers? What sort of system would you build when you have a very large collection of unstructured information, and the goal in searching that information is not to find one document (e.g. navigate to a home page), but to find (a) relationships between documents, or (b) large sets of documents that all pertain to a single topic? How would your algorithms be different? How would your interfaces be different? How would the process itself (that middle layer in between algorithms and interfaces) be different?
Via Daniel Tunkelang’s recent post, I think that Government information might be a perfect domain in which to ask (and answer) these sorts of questions. The U.S. Open Government Initiative has as its goal the release of loads of raw government data for use by any individual or organization. How are people going to use this data? What types of questions will they ask? What types of questions could they ask, if given the proper tools (i.e. what might they not know that they want to ask, until it becomes possible?)
Two types of information retrieval might be perfect for this domain: Exploratory Search and (Explicitly) Collaborative Search. Continue reading…