Google flu trends

by Eva Amsen

Playing with Google Trends is a favourite pastime of geeks.

You can look at searches that are more popular in certain locations compared to others, and you can compare the popularity of searches over time. For example, if you look at the trends for searches for “christmas”, “easter”, “valentines”, and “halloween” you’ll see that they are more popular in the days leading up to those holidays, peak on the day itself, and then abruptly drop again as people immediately lose interest.


People only look for info on holidays close to the date

In certain search words you can clearly see the pattern of the Northern hemisphere school vacations:


Nobody cares about math in summer or around Christmas.

Other keywords don’t show that seasonal dip as much.


Everyone always cares about coffee and chocolate.

Hey! Look at those peaks for chocolate! Don’t they look familiar? I overlapped the chocolate graph with that for Easter and Valentines Day to be sure (the Christmas peak was too high – it made the chocolate graph look like noise), and there it is: people search for chocolate around the holidays! It’s most clear in 2007 and 2008. You can also see the peaks around Christmas.


People search for chocolate around the holidays

This is all pretty mundane stuff, but Google can also be used to plot relevant trends. A team of researchers from Google and the Centers for Disease Control and Prevention showed that Google is faster at predicting when and where a flu outbreak will pop up than current methods.

The traditional way to spot an epidemic is to go by reported physician visits. If more people start seeing their doctor with flu-like symptoms, that is often the first sign that something is wrong. Physician visit stats are collected and published by agencies like the CDC, but that takes a few weeks to process. Epidemics can spread fast, so the earlier such trends can be spotted, the better they can be fought.

Now, a lot of people, when feeling unwell, look up their symptoms or medications in Google before (or even instead of) visiting a doctor. Fever, tired, headache, cold, Nyquil, vomiting, Tylenol – you name it. The researchers on this study followed trends of people searching for such keywords, and compared them with past data from the CDC that were based on physician visits in the corresponding areas. With certain keywords, analysis of Google data was able to detect the same local flu epidemics as the CDC had plotted. And not only that, Google could do it faster. Because it takes a while for people to visit the doctor and for the doctor to log the visit and for the CDC to collect these data and analyze them, Google was actually about one to two weeks ahead by grabbing the data immediately from the search logs.
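A rough way to picture what "one to two weeks ahead" means: slide one weekly series against the other and see which shift lines them up best. The sketch below does that in Python with made-up data. The function names `pearson` and `best_lead` and all the numbers are illustrative assumptions, and this is not the method used in the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def best_lead(searches, visits, max_lead):
    """Return (lead, r): the lead in weeks (0..max_lead) at which
    `searches` correlates best with the later `visits` series."""
    scores = {}
    for lead in range(max_lead + 1):
        # Pair searches in week t with physician visits in week t + lead.
        s = searches[: len(searches) - lead]
        v = visits[lead:]
        scores[lead] = pearson(s, v)
    lead = max(scores, key=scores.get)
    return lead, scores[lead]

# Synthetic weekly data (hypothetical): one outbreak-shaped bump,
# with the visit data showing the same bump two weeks later.
weeks = range(30)
searches = [math.exp(-((t - 12) ** 2) / 18) for t in weeks]
visits = [math.exp(-((t - 14) ** 2) / 18) for t in weeks]

lead, r = best_lead(searches, visits, max_lead=4)
print(lead, round(r, 3))
```

Because the synthetic visit curve is just the search curve delayed by two weeks, the best lead comes out as 2. Real search and surveillance data would be far noisier, so the best shift would be much less clear-cut.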

Google flu trends
(From the paper but also available as video)

The model based on CDC data (red line) follows the same trend as the model based on Google search info (black line) did earlier. (ILI = Influenza-like illness)

This could be a pretty useful tool. Pandemics become much harder to fight once they spread further, and knowing that you’re in the early stages of one is incredibly valuable. With a two-week delay in information, you’ll realize that you were in the early stages of a pandemic two weeks ago, which means that you could be in the middle of one right now!

Google set up a separate website for their Google Flu Trends project.

You can select a US state to see the current pattern of flu-related searches and, if you’re in that area, find out whether other people are experiencing the same symptoms. It’s not specific to smaller areas, so I doubt this would be of use to individuals. Still, if this is able to pick up emerging outbreaks faster than current tools, it could be useful for the CDC.

But above all, it’s just incredibly cool that someone got a Nature paper out of Google statistics. Can I publish the chocolate-holiday correlation, too?


Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski, Larry Brilliant (2008). Detecting influenza epidemics using search engine query data. Nature, 457 (7232), 1012-1014. DOI: 10.1038/nature07634


20 comments

Maxine Clarke February 19, 2009 - 5:47 PM

So, did you find your saved post, or is this a new one? I hope the former. Very nice blog, anyway – love the pictures.

Eva Amsen February 19, 2009 - 5:51 PM

This is the saved one. It was alllllllll the way at the bottom of my list of posts (more than 100). Oh, and my test post is still there as well. I should remember to turn that into a proper post (I don’t like wasting things, even electronically)

Cath Ennis February 19, 2009 - 6:24 PM

Great post, Eva! Google is going to take over the world, and I for one welcome our new overlords.
I saw a similar “analysis”:http://www.tuaw.com/2009/02/18/tracking-the-iphone-hype-generator/ yesterday of iPhone versus Blackberry / palm search peaks, highlighting the effectiveness of the Apple marketing machine.
Oh, and I once “blogged”:http://vwxynot.blogspot.com/2007/10/watch-movie-use-google-get-published.html about a _Current Biology_ paper about historical trends in left-handedness. The authors found a “modern control population” to compare to video archive footage by doing a Google Image search…

steffi suhr February 19, 2009 - 6:31 PM

Thanks Eva, this is cool!
Cath: _Google is going to take over the world, and I for one welcome our new overlords_
..isn’t that the name of one of your cats?

Eva Amsen February 19, 2009 - 6:33 PM

I think it was her cat that typed it.

Cath Ennis February 19, 2009 - 6:38 PM

Yes, I named her that in order to please my new overlords.
I was actually talking to some tech head friends last night about Google (the company). I said that I’m scared they’ll wait until everyone is using their search, email, analytics, calendar, maps, blogging, photos, documents and RSS reader services – and then suddenly announce that we need to start paying for them. The consensus was that as long as it doesn’t affect product quality, the tech heads would rather trust Google than the Canadian government…

Craig Rowell February 19, 2009 - 6:42 PM

Great post. I just typed in Science and Sports (very general categories). It was interesting to look at the break-down in Regions and Languages more than looking at the graph itself. There are strong regional differences in these two topics – Philippines and India lead in Science and US and UK love their sports.

Eva Amsen February 19, 2009 - 6:49 PM

Yes, that is a little scary, isn’t it? I was talking about the same thing a while ago with Richard, over either GMail or Google Talk…
I have all my stuff in there. All my e-mail accounts are aggregated. I save e-mail addresses by simply replying in GMail and letting it do it for me. I have drafts of everything I’m working on in Google Docs. My Google Calendar is especially important now that I have my iPod Touch, so that I can see what I’m supposed to be doing without having to use additional syncing software. I use Google Reader to (try to) keep track of stuff online.
Heck, I can barely find my own house without Google Maps!

Eva Amsen February 19, 2009 - 6:53 PM

My comment was in response to Cath. The fact that Craig’s wasn’t there yet when I started typing just shows how slow I am…
Anyway, Craig, yes, the regional differences are fun. Take a guess at which country is the top country for “Darwin”. (Then have a look, notice that you were probably wrong, and go “duh” when you find out why…)

steffi suhr February 19, 2009 - 7:02 PM

I just tried Darwin too, and went ‘duh’. But number 2, 3, 4… 5, 6..? Those are a surprise.

Cath Ennis February 19, 2009 - 7:06 PM

Ha! And no-one cares about Darwin at Christmas, either!

Darren Saunders February 19, 2009 - 7:23 PM

C’mon, the regional bias for Darwin is a no brainer if you live in the “other” hemisphere! Darwin and christmas would have had a V high correlation if Google was around a bit longer, the city got wiped off the map by a big Cyclone (“Tracy”:http://en.wikipedia.org/wiki/Cyclone_Tracy) around Christmas 1975.
Anyway, cool post… and cool tool

Eva Amsen February 19, 2009 - 7:29 PM

_”And no-one cares about Darwin at Christmas, either!”_
That inspired me to search for “Jesus”, who predictably peaks at Christmas and Easter – which _of course_ led to me overlaying “Jesus” with “chocolate”. (Now there’s something I’m sure I haven’t said before). Interesting finding: At Christmas, chocolate is more popular than Jesus, but at Easter it’s the other way around. Overall they’re equally popular. (Which _of course_ led me to start an entire religion based on chocolate…)
!http://farm4.static.flickr.com/3649/3292777609_c3ff533fb2_o.jpg!

Craig Rowell February 19, 2009 - 7:34 PM

In this Chocolate-based religion can you make sure that artificial sweeteners receive the same respect as false idols.

Cath Ennis February 19, 2009 - 7:34 PM

Ahem:
Oh, and also “this”:http://www.msnbc.msn.com/id/11669242/.

Eva Amsen February 19, 2009 - 7:36 PM

_”the city got wiped off the map”_
Off the Google Maps?

Eva Amsen February 19, 2009 - 7:38 PM

_”can you make sure that artificial sweeteners receive the same respect as false idols.”_
Naturally!

Maxine Clarke February 19, 2009 - 8:22 PM

See, I said that “losing” the pre-post had nothing to do with MT4 😉

Louis Shackleton February 19, 2009 - 9:24 PM

Wow, that really is a very cool way to use teh google.

Darren Saunders February 19, 2009 - 10:27 PM

What WOULD be cool is if you use that tool with Google Scholar?
