You can see which searches are more popular in certain locations than in others, and you can compare the popularity of searches over time. For example, if you look at the trends for searches for “christmas”, “easter”, “valentines”, and “halloween” you’ll see that they become more popular in the days leading up to those holidays, peak on the day itself, and then abruptly drop again as people immediately lose interest.
People only look for info on holidays close to the date.
For certain search terms, you can clearly see the pattern of Northern Hemisphere school vacations:
Nobody cares about math in summer or around Christmas.
Other keywords show much less of that school-vacation pattern.
Everyone always cares about coffee and chocolate.
Hey! Look at those peaks for chocolate! Don’t they look familiar? I overlaid the chocolate graph on the ones for Easter and Valentine’s Day to be sure (the Christmas peak was too high – it made the chocolate graph look like noise), and there it is: people search for chocolate around the holidays! It’s clearest in 2007 and 2008. You can also see the peaks around Christmas.
People search for chocolate around the holidays.
This is all pretty mundane stuff, but Google can also be used to plot more consequential trends. A team of researchers from Google and the Centers for Disease Control and Prevention showed that Google search data can reveal when and where a flu outbreak is popping up faster than current methods can.
The traditional way to spot an epidemic is to go by reported physician visits. If more people start seeing their doctor with flu-like symptoms, that is often the first sign that something is wrong. Physician visit stats are collected and published by agencies like the CDC, but that takes a few weeks to process. Epidemics can spread fast, so the earlier such trends can be spotted, the better they can be fought.
Now, a lot of people, when feeling unwell, look up their symptoms or medications in Google before (or even instead of) visiting a doctor. Fever, tired, headache, cold, Nyquil, vomiting, Tylenol – you name it. The researchers on this study followed trends of people searching for such keywords, and compared them with past data from the CDC that were based on physician visits in the corresponding areas. With certain keywords, analysis of Google data was able to detect the same local flu epidemics as the CDC had plotted. And not only that, Google could do it faster. It takes a while for people to visit the doctor, for the doctor to log the visit, and for the CDC to collect and analyze these data. By grabbing the data immediately from the search logs, Google was actually about one to two weeks ahead.
The model based on CDC data (red line) follows the same trend as the model based on Google search info (black line) did earlier. (ILI = Influenza-like illness)
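To make the "one to two weeks ahead" idea concrete, here is a minimal Python sketch of one simple way to estimate how far a search-interest series runs ahead of a physician-visit series: shift one against the other and see which shift gives the highest correlation. This is not the model from the paper. The function names (`pearson`, `best_lead`) and the weekly numbers are made up for illustration, and the visit series is deliberately constructed as the search series delayed by two weeks.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lead(search, visits, max_lead=4):
    """Return (lead_in_weeks, correlation) for the best-matching shift.

    A lead of k means search[t] is compared with visits[t + k],
    i.e. the searches happen k weeks before the matching visits.
    """
    best = (0, pearson(search, visits))
    for lead in range(1, max_lead + 1):
        r = pearson(search[:-lead], visits[lead:])
        if r > best[1]:
            best = (lead, r)
    return best


# Hypothetical weekly values (not real Google or CDC data).
search_interest = [5, 6, 9, 15, 28, 45, 60, 52, 35, 20, 11, 7, 5, 5]
physician_visits = [5, 5, 5, 6, 9, 15, 28, 45, 60, 52, 35, 20, 11, 7]

lead, r = best_lead(search_interest, physician_visits)
print(f"Searches lead visits by {lead} week(s), correlation {r:.2f}")
```

On this toy data the best match is at a two-week shift, because that is how the example was built. Real data would be much noisier, and a high correlation at some shift would not by itself prove that searches predict doctor visits.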
This could be a pretty useful tool. Pandemics become much harder to fight once they spread further, and knowing that you’re in the early stages of one is incredibly valuable. With a two-week delay in information, you’ll realize that you were in the early stages of a pandemic two weeks ago, which means that you could be in the middle of one right now!
Google set up a separate website for their Google Flu Trends project.
You can select a US state to see the current pattern of flu-related searches, and if you’re in that area, know if other people are experiencing the same symptoms. It’s not specific to smaller areas, so I doubt this would be of use to individuals. Still, if this is able to pick up emerging outbreaks faster than current tools, it could be useful for the CDC.
But above all, it’s just incredibly cool that someone got a Nature paper out of Google statistics. Can I publish the chocolate-holiday correlation, too?
Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski, Larry Brilliant (2008). Detecting influenza epidemics using search engine query data. Nature, 457 (7232), 1012-1014. DOI: 10.1038/nature07634