Evaluating single-sample Google Trends research studies: What’s hype and what’s not

Jacques Raubenheimer, Jonathan Inkiriwang, Adela Wu, Nicholas Buckley

Background: Digital health studies using Google Trends (GT) have increased exponentially since the release of the first GT paper in 2009. However, the recommendations of two systematic reviews (Nuti et al, 2014; Mavragani et al, 2018) have largely been ignored. Moreover, only a handful of the several thousand studies have drawn multiple samples to improve their estimates. The impact of inadequate sampling is unknown.

Aims: To re-examine published GT studies by drawing multiple samples from GT, and to estimate the impact of inadequate sampling.

Methods: We identified four studies with published data: Lazer et al (2014), Gamma et al (2016), Husnayain et al (2019), and Schneider et al (2020). We first replicated the methods of each study using the authors' own data, confirming their results. We then extracted 130 samples of GT data using the same specifications (time, region, terms) as the authors. Finally, we repeated each study's analysis on every sample and tallied the impact on the results.
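
The repeat-and-tally step can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of any of the four studies: the analysis is stood in for by a Pearson correlation against a reference series, the 0.5 threshold is arbitrary, the data are synthetic, and the names `pearson` and `tally_outcomes` are hypothetical.

```python
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def tally_outcomes(samples, reference, threshold=0.5):
    """Run the same (stand-in) analysis on every sample and count the conclusions."""
    tally = Counter()
    for series in samples:
        r = pearson(series, reference)
        label = "supports" if r >= threshold else "does not support"
        tally[label] += 1
    return tally


# Synthetic stand-in data (assumption): 130 noisy copies of one trend.
rng = random.Random(0)
reference = [float(t) for t in range(24)]
samples = [[t + rng.gauss(0, 20) for t in range(24)] for _ in range(130)]

tally = tally_outcomes(samples, reference)
print(dict(tally))
```

If every sample led to the same conclusion, the tally would contain a single label; a split tally indicates that the conclusion depends on which sample happened to be drawn.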

Results: Each study's results varied across the samples, from minor but noticeable differences to significantly different outcomes. As expected, the best estimate was obtained using the mean series across all 130 samples.
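
A mean series of this kind can be computed by averaging the samples point by point. The sketch below assumes each sample is a list of values on a common time axis; the function names and the three tiny example series are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def mean_series(samples):
    """Average equal-length series point by point across samples."""
    if len({len(s) for s in samples}) != 1:
        raise ValueError("all samples must cover the same time points")
    return [mean(point) for point in zip(*samples)]


def spread_series(samples):
    """Sample standard deviation at each time point, a rough gauge of sampling noise."""
    return [stdev(point) for point in zip(*samples)]


# Three made-up samples of the same three-point series.
samples = [
    [10.0, 20.0, 30.0],
    [12.0, 18.0, 33.0],
    [8.0, 22.0, 27.0],
]

averaged = mean_series(samples)
spread = spread_series(samples)
print(averaged)
print(spread)
```

The per-point spread is included only as one possible way to see how much individual samples disagree with the mean series.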

Conclusions: Research using GT needs to take multiple samples of data to obtain accurate estimates before analysis. The number of samples needed appears to be related to the time frame (earlier data require more samples). The four studies all used different methods of analysis, and more research is needed to determine the extent to which different analysis methods are affected by inadequate sampling.