On the surface, Google’s new Google Trends service seems like it could be really powerful. By graphing the relative search frequency of comma-separated terms, you get instant snapshots of the collective consciousness. Trouble is, Trends does a terrible job of reading your mind.
The problem isn’t that the service is in beta; it’s the difficulty of crafting queries whose results actually demonstrate search trends. Is “diesel” really searched for so much more often than other fuels? Try ethanol, hybrid, hydrogen, gasoline, diesel. Change “gasoline” to “gas” or “petrol” and the chart changes so dramatically that you realize the apparent “trends” are virtually meaningless.
This result seems plausible on the surface: emo, hardcore, punk, alternative. But look at the associated news items and you’re reminded that the word “alternative” can mean virtually anything. Remove it from the query for better results.
Comparing the popularity of mac, linux, windows is really hard. Should you have used “OS X” or “Mac OS” rather than “Mac”? How could you consolidate all three terms to act as one in the query?
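If you could get at the underlying numbers, one way to think about consolidation is to add the variant series together before charting. The sketch below is only an illustration under that assumption: the `consolidate` function, the grouping labels, and every count are hypothetical, and nothing here reflects how Trends itself works.

```python
def consolidate(series_by_term, groups):
    """Sum the per-period counts of each group's variants into one series."""
    merged = {}
    for label, variants in groups.items():
        # zip(*) lines up the same period across all variant series
        periods = zip(*(series_by_term[v] for v in variants))
        merged[label] = [sum(period) for period in periods]
    return merged

# Invented per-period counts, for illustration only.
counts = {
    "mac": [50, 55],
    "os x": [20, 25],
    "mac os": [10, 12],
    "linux": [60, 58],
    "windows": [200, 190],
}

# Hypothetical grouping: three spellings treated as one term.
groups = {
    "mac (all variants)": ["mac", "os x", "mac os"],
    "linux": ["linux"],
    "windows": ["windows"],
}

merged = consolidate(counts, groups)
print(merged["mac (all variants)"])
```

Note that a plain sum double-counts any search that matches more than one variant (a search for “mac os” also contains “mac”), so even this would only approximate a true combined term.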
More so than with normal searches, ambiguous words and double meanings have huge potential to skew results. Try comparing the popularity of internet video formats: quicktime, real, windows media. The results don’t work because “real” means so many things. And it’s very hard to tune the search with variants like “real media” or “realvideo.”
Here’s one that actually is relatively unambiguous: beefheart, zappa. Neither term has another common meaning, and people searching for these artists would almost always use exactly those terms. Expanding this to captain beefheart, frank zappa yields almost exactly the same chart.
On the other hand, here’s one that can totally invert results if you’re insufficiently specific: thelonious monk, john coltrane. Now compare: monk, coltrane. In the first query, “coltrane” is way more popular. In the second, “monk” is. But according to the related news items, few of the “Monk” results refer to Thelonious Monk at all. Pay attention.
How does it do with politics? democrats, republicans reflects an even split, so it has captured the zeitgeist. But house, senate does something surprising: I expected the word “house” to screw things up since it’s so generic, but the associated news items suggest that Trends is contextualizing the query, and the graph might actually be limiting the term “house” to political contexts.
You don’t have to use the service comparatively. Bush approval rating doesn’t reflect the arc of Bush’s approval rating, but how often people searched on that phrase (though I’m not seeing the upward spike in recent months I would have expected).
Gross differences in popularity can also result in less meaningful graphs. If you chart mp3, ogg, aac, wmv, MP3 so mightily outweighs the others that the alternative trajectories are virtually indiscernible. Ditch the string “mp3” for a clear reading of how other formats stack up.
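To see why one dominant term flattens the rest, here is a minimal sketch. It assumes the chart scales every series against the single largest value in the query, which is a guess about how the graph is drawn rather than anything Google has stated; the function name `scale_to_peak` and all the counts are invented for illustration.

```python
def scale_to_peak(series_by_term):
    """Scale all series so the largest value across every term becomes 100."""
    peak = max(max(values) for values in series_by_term.values())
    return {
        term: [round(100 * v / peak, 1) for v in values]
        for term, values in series_by_term.items()
    }

# Invented numbers, for illustration only.
counts = {
    "mp3": [9000, 9500, 10000],
    "ogg": [40, 60, 50],
    "aac": [30, 45, 70],
}

# Same data, charted with and without the dominant term.
with_mp3 = scale_to_peak(counts)
without_mp3 = scale_to_peak({t: v for t, v in counts.items() if t != "mp3"})

print(with_mp3["ogg"])     # every value is below 1 on a 0-100 scale
print(without_mp3["ogg"])  # the same counts now span a readable range
```

Under that assumption, the smaller formats are squeezed into the bottom one percent of the chart while “mp3” is present, and dropping it lets their own ups and downs become visible.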
Nor was I able to figure out whether more people prefer paper or plastic. This one can be better refined as paper bags, plastic bags, but the associated news items reminded me that the query really wasn’t addressing the question I thought I was asking. And besides, I would have to be careful to remember that people searching on “plastic bags” more than “paper bags” would only mean that people have more questions about plastic, not that people actually do choose plastic bags more often at the grocery store.
Fair enough, Trends throws a prominent disclaimer:
As a Google Labs product, it is still in the early stages of development. Also, it is based upon just a portion of our searches, and several approximations are used when computing your results. Please keep this in mind when using it.
The disclaimer should probably start with something like “Do not use Google Trends to settle bets!” The trouble with Trends goes deeper than it being in beta. Google is going to need a boatload of amazing AI to figure out the context problems. Amazing toy, but mired in caveats.