After installing Google Analytics on a website, you should start thinking about ways to improve the accuracy of your analytics data. Google Analytics provides some smart tools that let you easily exclude well-known robots and spiders, exclude traffic from your own network, or reduce the noise generated by the internet.
Internet noise? Yes, there is a lot of noise out there, generated by bots capable of running JavaScript, browser extensions spoofing the HTTP Referer, cache websites or other applications with similar behaviour.
But wait, there’s more! Some malicious tools and browser extensions are intentionally generating fake pageviews and reporting wrong referrers to grab your attention. Why? Because it’s free advertising, that’s why!
Exclude traffic from well-known bots and spiders
To exclude traffic from well-known bots and spiders, log into your Google Analytics account. Go to Admin, select the desired Account, Property and View, click on View Settings and tick the check box labeled Exclude traffic from known bots and spiders.
Enabling this option instructs Google Analytics to apply a filter that excludes all known spiders and bots on the IAB list (International Spiders & Bots List).
Exclude traffic for specific IP addresses
While Advanced Google Analytics for Joomla does a pretty good job of leaving out the tracking code for logged-in users (like Administrators), you should also consider using a filter to exclude your own IP, if it’s static. Companies can also exclude an entire network or multiple subnetworks to temper the noise generated by their own employees while browsing the website.
To exclude traffic for specific IP addresses, go to Admin, select the desired Account, Property and View, click on Filters and click on New Filter:
In the above example, I’m using a predefined filter to exclude traffic from a specific IP. To exclude an entire IP block you can use “that begin with” instead of “that are equal to”.
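The difference between the two match types can be sketched in a few lines of Python. This is only an illustration of exact versus prefix matching on the IP string, not how Google Analytics implements its filters; the function name `is_excluded`, the `mode` values and the addresses (taken from the ranges reserved for documentation) are all made up for the example.

```python
def is_excluded(ip: str, value: str, mode: str = "equal") -> bool:
    """Return True if a hit from `ip` would be dropped by the filter.

    Hypothetical helper: `mode` mimics the two predefined match types,
    "equal" (that are equal to) and "begins_with" (that begin with).
    """
    if mode == "equal":
        return ip == value
    if mode == "begins_with":
        return ip.startswith(value)
    raise ValueError(f"unknown mode: {mode}")


# A single static address is matched exactly.
print(is_excluded("203.0.113.7", "203.0.113.7"))                   # True
print(is_excluded("203.0.113.70", "203.0.113.7"))                  # False

# A whole block is matched by its prefix; note the trailing dot.
print(is_excluded("203.0.113.70", "203.0.113.", "begins_with"))    # True
print(is_excluded("198.51.100.1", "203.0.113.", "begins_with"))    # False
```

One thing the sketch makes visible: a prefix without the trailing dot, such as `203.0.113.7`, would also match `203.0.113.70`, so it is usually safer to end the prefix at an octet boundary.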
Exclude “fake pageviews” generated elsewhere on the internet
Three more things affect your Google Analytics data accuracy, and all of them have an easy fix:
- lazy developers, who start a new project from your design and forget to remove your analytics code or to disable your analytics plugin
- cache or archive sites, which display your pages without removing your Google Analytics code
- spam: fake pageviews designed for advertising and to grab your attention; usually a page that doesn’t exist on your website shows up in your Google Analytics stats
To exclude this type of traffic, go to Admin, select the desired Account, Property and View, click on Filters and click on New Filter:
Enter your domain name in the Hostname field (only the root domain, without subdomains like www). With this filter in place, your Google Analytics reports will only include the traffic from your own domain.
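To make the effect of the hostname filter concrete, here is a minimal Python sketch. It assumes the filter keeps a hit when the reported hostname is the root domain or one of its subdomains; the exact matching Google Analytics performs depends on the filter type you choose, so treat this as an approximation. The function name `keep_hit` and the hostnames are hypothetical.

```python
def keep_hit(hostname: str, root_domain: str) -> bool:
    """Return True if the hit was recorded on our own domain.

    Assumption: a hit is kept when its hostname equals the root domain
    or is a subdomain of it (www.example.com, shop.example.com, ...).
    """
    host = hostname.lower().rstrip(".")
    root = root_domain.lower()
    return host == root or host.endswith("." + root)


hits = [
    "www.example.com",        # your own site
    "example.com",            # your own site, without www
    "webcache.example.net",   # a cache site serving a copy of your page
    "staging.other-site.org", # a project that reused your tracking code
]

for hostname in hits:
    print(hostname, keep_hit(hostname, "example.com"))
```

Only the first two hostnames pass the check. Hits recorded on a cache site or on someone else's project carry a different hostname, which is why entering just the root domain is enough to drop them.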
Fake referrers? There is not much you can do about them, because by filtering them out you risk excluding good traffic coming from users with an infected browser or a malicious extension.
There are more filters to apply and actions to take, but these require an in-depth audit of your analytics data accuracy; each website has its own particularities and specific needs.
Note: When applying filters, you should always create or keep an unfiltered View, because once your reports are generated they can’t be re-processed. A misconfigured filter can lead to permanent data loss. For the same reason, filters only apply to future traffic and have no effect on previously collected data.