Google Analytics Data Sampling
Improved data sampling was a noticeable upgrade when a new version of Google Analytics was rolled out a few months ago. So why is data sampling used, and how can it benefit reporting on marketing campaigns?
To manage large data sets, Google Analytics samples the data rather than using resources to calculate exact values. Data sampling is triggered when either of the following conditions is met:
- Any ad hoc query that is not pre-computed and reaches 500,000 visits (sessions)
- Any query that exceeds 1,000,000 unique dimension combinations
For example, Google Analytics will automatically display sampled data with any of the following reports or segmentation methods:
- Top Content Report
- Top Landing Page Report
- Custom Reports
- Advanced Segments
- Multi-channel funnels are sampled once the number of paths exceeds 1,000,000
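As a rough illustration, the thresholds above can be written as a simple check. This is only a sketch: the function name and parameters are hypothetical, the limits are the figures quoted in this article, and nothing here calls a Google Analytics API.

```python
# Thresholds as quoted in this article; they are not read from any Google API.
SESSION_THRESHOLD = 500_000
DIMENSION_COMBINATION_THRESHOLD = 1_000_000
MCF_PATH_THRESHOLD = 1_000_000


def is_likely_sampled(sessions, dimension_combinations, precomputed=False, mcf_paths=0):
    """Return True if a query meets any of the sampling conditions listed above.

    Hypothetical helper for illustration only.
    """
    # Multi-channel funnels: sampled once the number of paths exceeds the limit.
    if mcf_paths > MCF_PATH_THRESHOLD:
        return True
    # Any query that exceeds the unique dimension combination limit.
    if dimension_combinations > DIMENSION_COMBINATION_THRESHOLD:
        return True
    # Ad hoc (not pre-computed) queries that reach the session limit.
    return (not precomputed) and sessions >= SESSION_THRESHOLD
```

For instance, `is_likely_sampled(750_000, 2_000)` returns True for an ad hoc query, whereas the same figures with `precomputed=True` return False.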
What is the reason for using sampled data?
The principle behind this function is that results from a subset of the data will be comparable to results from the full data set. If you need the data quickly, a smaller sample will speed up reporting but will be less accurate. Conversely, if you require a more complete picture, a larger sample will slow down the report but will be closer to the real numbers.
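The trade-off can be seen in a small simulation. The sketch below assumes simple random sampling over synthetic session data, which is not necessarily how Google Analytics selects its sample. The `estimate_total` helper and the 3% conversion rate are made up for illustration.

```python
import random


def estimate_total(values, sample_rate, seed=0):
    """Estimate sum(values) from a random sample, scaled up by the sample size."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(values) * sample_rate))
    sample = rng.sample(values, sample_size)
    return sum(sample) * len(values) / sample_size


# Synthetic data: 1 = the session converted, 0 = it did not (roughly 3% convert).
rng = random.Random(42)
sessions = [1 if rng.random() < 0.03 else 0 for _ in range(200_000)]
exact = sum(sessions)

# Compare the scaled-up estimate with the exact total at several sample rates.
for rate in (0.01, 0.1, 0.5, 1.0):
    estimate = estimate_total(sessions, rate)
    error = abs(estimate - exact) / exact * 100
    print(f"sample rate {rate:>4.0%}: estimate {estimate:,.0f} vs exact {exact:,} ({error:.2f}% off)")
```

Smaller sample rates need less data but tend to drift further from the exact total. At a 100% sample rate the estimate matches the exact figure.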
Finding sampled data reports in Google Analytics
When you run a report for one of the above report types, a square grid appears below the date range box on the top right of the Analytics screen, for example:
When this is clicked a box appears with a sliding scale with 'Faster Processing' and 'Higher Precision' options at either end:
By default, when opened, this is set to a medium accuracy level (usually around 30-40%).
Sliding the circular icon to the left will produce a report based on less data, but with a faster reporting time:
Sliding it to the right will use a larger data set, but the report will take longer to produce:
Even if the slider is set to the most accurate level, the report will never be 100% accurate.
It should be noted that when you choose a sampling threshold within an Analytics session, that threshold level will be used in all reports until you close Google Analytics.
Sampling is triggered automatically in these reports. However, the paid enterprise edition of Google Analytics (Premium) provides access to reporting based on full, unsampled data sets.
To ensure that reporting is as accurate as possible, we always base our reports on the highest available sampling level for all Equator clients.
For more information on this function: