A week or two ago I began writing a few thoughts about large-data-based algorithms and retrievability. They were prompted by the Unreasonable Effectiveness of Data position paper by a couple of notable Googlers, which then led to a brief discussion.
My main contention was that by relying too heavily on algorithms that are based solely on accumulations of large data, and by not offering users exploratory search options to turn off the large-data popularity bias, search engines would leave searchers unable ever to find certain pieces of relevant information. This is not even a matter of knowing the correct query terms to use; I argued (backed up by published research) that even if you knew the correct terms, you still could not find certain pieces of information.
Well, now I want to write about the other half of the equation: What do you do when the information is retrievable under some term, but you just do not know that term? Why do search engines not give you more help with finding information that is retrievable if you use exactly the right word, when no reasonable person would ever know that word?
Let me give an example: Hidden Cafes in Prague.
When I travel to foreign cities, I often like to seek out the off-beat, far-from-the-beaten-path areas and cafes. I don’t want to sit downtown in some overpriced tourist cafe. I want to find that place tucked away down the long winding lane. And the city of Prague is perfect for this sort of activity. It is filled with all sorts of dark passageways and hidden lanes. Just look at some of them, via Flickr: Pohorelec passage, a passage near the castle, a random alleyway, a covered passage between buildings, filled with shops, another passage near the castle, Tesla passage, a passage between buildings so skinny it has its own traffic light, another random passage, etc.
So what I would like to do is use a search engine to find a lot of these hidden, off-the-beaten-path cafes. Now, I’m sure if I typed in the name of every single one of these cafes I would easily be able to locate all of their addresses. But that is the point — not only do I not know where these cafes are; I don’t know what they’re named. I only know that they must exist.
If I query for [prague cafe] I do not find what I need. If I query for [hidden prague cafe] it also doesn’t do the trick. If I query for [prague passage cafe] I get a couple of the cafes in the larger shopping passages, but none of the smaller, more interesting, hidden cafes.
In particular, after about 20 minutes of trying, I have been unable to come up with a query that finds the one cafe I would most want to discover if I didn’t already know about it. It’s the cafe at the hotel U Raka. Walk up to Prague castle, then keep going. You’ll hit a small lane out back, Novy Svet, and meander through its quiet, almost magical streets. In fact, here are some pictures taken in the wintertime, which is when I actually discovered it many years ago. When you get to the end of the road, you’ll hit a large wall, and you’ll turn right onto Cerninska. Almost immediately in front of you is the cafe at U Raka (at night, and during the day — see the cafe sign?).
So I can find this cafe because I know its name. It is “retrievable” by today’s search engines. But the problem is that most people looking for off-path cafes in Prague will not know its name. What we need, then, is an interactive search engine that helps us find this information, when all we have to start with is the query [hidden prague cafe]. It’s a tall order, but it’s a worthy goal.
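The gap between "retrievable" and "findable" can be made concrete with a toy example. What follows is only a minimal sketch, not how any real search engine works: it assumes a tiny hand-made collection (the document texts are invented for illustration) and strict AND matching over an inverted index, with hypothetical helper names `build_index` and `search`. The point is just that the U Raka page comes back for the query that names it, and not for the query a newcomer would actually type.

```python
from collections import defaultdict

# Hypothetical mini-collection; these document texts are invented for illustration.
DOCS = {
    "u_raka": "hotel u raka cerninska novy svet",
    "passage_cafe": "prague passage cafe shopping arcade",
    "tourist_cafe": "prague cafe old town square",
}


def build_index(docs):
    """Build an inverted index mapping each term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Strict AND retrieval: return ids of docs that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


INDEX = build_index(DOCS)

print(search(INDEX, "u raka"))              # the name retrieves the page
print(search(INDEX, "hidden prague cafe"))  # the natural query retrieves nothing
print(search(INDEX, "prague cafe"))         # only the better-known cafes match
```

In this sketch the page for U Raka is perfectly retrievable, but only under vocabulary the searcher does not have. An interactive engine of the kind I am wishing for would have to bridge that vocabulary gap somehow; the sketch only shows the gap, not a solution.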