When Big Isn’t Better: How the Flu Bug Bit Google

"Big Data" is currently being proposed as a tool for addressing every form of human endeavour from the design of products to that of buildings and public spaces. The understanding of disease, for example, had been said to benefit from algorithms which monitor their spread. But without contextual information, the numbers can be misleading.

“The Parable of Google Flu: Traps in Big Data Analysis”, recently published in the journal Science, examined Google Flu Trends, Google’s data-aggregating tool designed to provide real-time monitoring of flu cases around the world based on Google searches for flu-related terms. Google Flu Trends was found to have overestimated the prevalence of flu in the 2012-2013 season and overshot the actual level of flu in 2011-2012 by more than 50 percent. Further, from August 2011 to September 2013, it over-predicted the prevalence of flu in 100 of 108 weeks. The research raises important questions about the accuracy achievable with tools built on other data sources, such as Twitter and Facebook. In their conclusions, the researchers highlighted the need to combine multiple forms of data, including those gathered by traditional techniques, to reach a more complete understanding of the context that produced the data.