Via Xavier Amatriain: The Dirty Little Secret About the “Wisdom of the Crowds” – There is No Crowd:
This is hardly the first time that the so-called “wisdom of the crowds” has been called into question. The term, which implies that a diverse collection of individuals makes more accurate decisions and predictions than individuals or even experts, has been used in the past to describe how everything from Wikipedia to user-generated news sites like Digg.com offer better services than anything created by a smaller group could do.
Of course, we now know that simply isn’t true. For one thing, Wikipedia isn’t written and edited by the “crowd” at all. In fact, 1% of Wikipedia users are responsible for half of the site’s edits. Even Wikipedia’s founder, Jimmy Wales, has been quoted as saying that the site is really written by a community, “a dedicated group of a few hundred volunteers.”
Still, there [has] yet to be a perfect solution to the problem. Perhaps it’s time we give up the idea that the “wisdom of the crowds” was ever a driving force behind any socialized, user-generated anything and realize that, just like in life, there will always be active participants as well as the passive passersby.
I have never quite liked the notion of “wisdom of crowds”, and the hype behind it even less, so I’m glad to see signs that the hype cycle is finally starting to wind down. However, by having to confront exactly what it was that I didn’t like about the notion, I was intellectually forced to propose an alternative: Explicit Collaboration in Search. As I wrote half a year ago:
[In early 2006], I was having a visceral reaction against all the hype surrounding “collective intelligence” and “wisdom of the crowds” as a primary basis for doing information retrieval. I was not interested in collaboration as massive data crunching on top of an anonymous crowd, nor even on top of a known crowd of friends. I wasn’t interested in crowds of any kind. Rather, I thought (and still do) there is a lot more value left to be extracted from content-based search methods.
However, where these thoughts soon led me was to a notion of collaboration in search quite different from “wisdom of crowds” methods. With Maribeth Back and Gene Golovchinsky, I envisioned collaboration as a sort of “musical jam session”, where a small set of common-goal searchers got together and “played” their search “tunes” together over a content-based retrieval back end. The purpose of this jamming wasn’t to repeat each other’s notes (“people who play this note also play that note”) but to play melodies and basslines that were different, but that worked together toward the larger goal of creating a full “song”, a commonly constructed set of relevant information. To date, this notion of collaboration in search has proven, and continues to be, quite fruitful. There are all sorts of research “melodies” left to be played, all sorts of songs left to be sung, by all sorts of researchers. I continue to be excited about this notion of “search jamming”, and look forward to the solutions that the community will continue to invent.
I have not spent too much time blogging about this ongoing collaborative information seeking research on IRGupf. Most of what (Gene and) I have written on that topic appears on the FXPAL Blog (see ). Still, I found the aforementioned “There is no Crowd” post interesting enough to warrant a brief mention here. I find Perez’s (the author’s) conclusion compelling: Perhaps it’s time we give up the idea that the “wisdom of the crowds” was ever a driving force behind any socialized, user-generated anything. If you want to put the “social” into your information seeking systems and algorithms, then it makes sense to really make it social, i.e., there should be explicit interaction with a known person, rather than implicit, passive behavioral aggregation.
[Note that explicit collaboration in information seeking is only one of many possible alternatives to the “wisdom of the crowd”. Xavier proposes a different solution in a paper published at SIGIR 2009.]
While the “wisdom of crowds” has been even more oversold than the “long tail”, I do think there are places where it is both possible and effective to leverage a mass of passive participants, e.g., by mining search logs. Granted, it is a limited sort of wisdom, but it’s certainly useful for tasks like obtaining a vocabulary for content tagging.
Yes, there are indeed places where it works. But more attention needs to be paid, and more discussion needs to happen, about where those places are, as well as what the limitations are. It’s true; mining search logs helps. But it helps for short snout, navigational types of information seeking. Not so much for exploration 🙂 Because the point of exploration is that you’re going to take a different path than all those who came before you, not the same path.