Self-Referrals in Google Analytics

One of the most useful reports in Google Analytics is the ‘Referring Sites’ report. It shows which specific sites link to yours and send you traffic. With this information you can strengthen ties with those sites, aim for traffic from similar sites, and otherwise build on these connections.

Referring Sites Report

A common situation that causes head-scratching in Google Analytics is when a site owner sees referrals coming from his or her own site. The referral below looks fine, except that the site it is sending visitors to is itself ‘bcbeer.ca’:

self-referral example

These self-referrals can be caused by several things. I recently came across a post on Avinash Kaushik’s legendary Occam’s Razor blog where, in the comments below the article, Robbin Steif laid out the most common reasons for self-referrals:

  1. The site has subdomains or cross domains and the GA configuration is not set up to recognize these as part of the same domain.  A common scenario is the subdomain has the same tracking code as the main domain, but GA considers them separate entities, so visitors that move from the main domain to the subdomain are counted as referrals from the main domain. (For advice on how to fix: How do I track all of the subdomains for my site..?)
  2. The site may be set up correctly now, but may have been set up incorrectly in the past (as in #1), and some returning visitors still carry the old, incorrect cookie information.
  3. Traffic from untagged pages on your site to tagged pages.  Another good reason for ensuring that GA tracking code is pasted into every page on your site.

Another commenter on the same blog post suggested that self-referrals can also come from absolute links on your site:

“If the URL in your hyperlinks read http://www.yoursite.com/movies/movies.aspx …they should instead read /movies/movies.aspx.”

This seems unlikely, since in such a case the cookies on the user’s browser would indicate that she is not a new visitor to the site.  However, it may be worth looking into once the other 3 common causes are eliminated.
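
When working through the referral list, it can help to flag which entries are really your own site, including subdomains as in cause #1. The following is a minimal sketch of that check, not part of Google Analytics itself; the function name `isSelfReferral` and the `rootDomain` parameter are made up for illustration.

```javascript
// Returns true when the referrer's hostname is the site's own domain
// or one of its subdomains. `rootDomain` is a hypothetical parameter,
// e.g. "bcbeer.ca".
function isSelfReferral(referrerUrl, rootDomain) {
  let host;
  try {
    host = new URL(referrerUrl).hostname.toLowerCase();
  } catch (e) {
    return false; // empty or malformed referrer: not a self-referral
  }
  const root = rootDomain.toLowerCase();
  // Match the bare domain, or any host that ends with ".<root>".
  return host === root || host.endsWith("." + root);
}

console.log(isSelfReferral("http://www.bcbeer.ca/events", "bcbeer.ca")); // true
console.log(isSelfReferral("http://www.google.com/search", "bcbeer.ca")); // false
```

The leading dot in the suffix check keeps look-alike hosts such as `notbcbeer.ca` from being counted as part of the site.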

Excluding Visitors from Google Analytics via Cookie

Normally when you are setting up Google Analytics (or any other web analytics program for that matter), you want to set it up so that it doesn’t count traffic from visitors using the site on behalf of the site owner. This may include employees of the company that owns the site, web developers or online marketers contracted by the site owner, etc.

Usually this is done quite simply by using a filter to exclude the IP addresses of the visitors in question. There is a twist, in that you need to use regular expressions (which in this case means using the ‘\’ in front of the ‘.’s in the IP addresses so they are not treated as wildcards), but the process is pretty well explained in the Google Analytics help topic called ‘How do I exclude my internal traffic from reports?’. Google even provides an IP range tool to help with getting the right regular expression to cover a range of IP addresses.
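
To see why the backslashes matter, here is a small sketch of the escaping step. The IP address is a made-up example, `ipToFilterPattern` is a hypothetical helper name, and the `^` and `$` anchors are my own addition to match a single address exactly.

```javascript
// Escape every "." so the filter pattern matches a literal dot
// instead of "any character".
function ipToFilterPattern(ip) {
  return "^" + ip.replace(/\./g, "\\.") + "$";
}

const pattern = ipToFilterPattern("192.168.1.10"); // hypothetical address
const escaped = new RegExp(pattern);
const unescaped = new RegExp("^192.168.1.10$");

console.log(pattern);                         // ^192\.168\.1\.10$
console.log(escaped.test("192.168.1.10"));    // true
console.log(escaped.test("192x168y1z10"));    // false
console.log(unescaped.test("192x168y1z10"));  // true: "." matched any character
```

For a range of addresses, Google's IP range tool mentioned above is the safer route than hand-writing the expression.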

Of course, this process assumes that the visitors you want to screen out are operating from fixed IP addresses. For people working from home, or even small businesses, this may not be the case, as they may be using IP addresses dynamically assigned by their ISP. The way to address this is to set a cookie when the visitor loads a page dedicated to this purpose, and then filter out visitors based on the presence of that cookie.

This solution is addressed at the bottom of Google’s ‘How do I exclude my internal traffic…’ page, but I thought it was worth elaborating on. The information below is derived from the Google page but with some enlightening information from Brian Clifton’s excellent book ‘Advanced Web Metrics with Google Analytics.’

There are two stages to the process:

  1. Set up a dedicated page that will set a cookie on the visitor’s browser using Google Analytics code.
  2. Set up an exclude filter in Google Analytics that will cause GA to ignore traffic from visitors carrying the cookie.

Step 1: Set a Cookie

  1. Create a new blank HTML page.
  2. Add a snippet of JavaScript to the body tag as shown in Example 1 below.
  3. Insert the usual GA tracking code for the site on the page between the <body></body> tags.
  4. Save the page on your server with a name like ‘ga-exclude.html’.

Example 1: <body onLoad="javascript:pageTracker._setVar('exclude_visitor');">

This _setVar() value is then stored in the __utmv cookie when a visitor loads the page.

Step 2: Add an Exclude Filter in GA

Analytics>Profile Settings>Edit Filter
Filter Type: Custom filter > Exclude
Filter Field: User Defined
Filter Pattern: exclude_visitor (must match the value of _setVar)
Case Sensitive: No

Google Analytics Filter for Exclusion by Cookie

Brian says in ‘Advanced Web Metrics’ that the “value of setVar() is stored in the Google Analytics field labeled User Defined.” (p. 70) I take this to mean that the User Defined field reads from the __utmv cookie (which holds the value of _setVar()), but in any case, this explains the use of ‘User Defined’ in the filter field.
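
As a rough mental model of what the filter settings do, the sketch below treats the filter pattern as a regular expression tested against the User Defined value. This is only an illustration: how Google Analytics evaluates filters internally is not documented here, and `isExcluded` and its parameters are hypothetical names.

```javascript
// Illustrative model of an exclude filter on the User Defined field.
// Not GA's actual implementation.
function isExcluded(userDefinedValue, filterPattern, caseSensitive) {
  if (!userDefinedValue) return false; // no _setVar() value: visit is kept
  const flags = caseSensitive ? "" : "i";
  return new RegExp(filterPattern, flags).test(userDefinedValue);
}

console.log(isExcluded("exclude_visitor", "exclude_visitor", false)); // true
console.log(isExcluded("Exclude_Visitor", "exclude_visitor", false)); // true
console.log(isExcluded("Exclude_Visitor", "exclude_visitor", true));  // false
console.log(isExcluded(undefined, "exclude_visitor", false));         // false
```

With ‘Case Sensitive: No’, a small capitalization mismatch between the filter pattern and the _setVar() value would not stop the filter from matching.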

All that’s left to do is to point the visitors that you don’t want to track to the cookie page (‘www.yoursite.com/ga-exclude.html’). Once the cookie is set on their browser, it will be there for 24 months, during which time their visits to the site will not show up in Google Analytics. Voila!
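
If you want to confirm that the cookie was actually set after visiting the page, something like the following can be pasted into the browser console. It is a sketch under assumptions: ga.js names the cookie `__utmv` (two underscores), and I am assuming its value contains the _setVar() label after a numeric prefix; `hasExcludeCookie` is a hypothetical helper.

```javascript
// Checks a cookie string (e.g. document.cookie) for a __utmv cookie
// whose value contains the given label.
function hasExcludeCookie(cookieString, label) {
  return cookieString
    .split(";")
    .map((pair) => pair.trim())
    .some((pair) => {
      const eq = pair.indexOf("=");
      if (eq === -1) return false; // not a name=value pair
      const name = pair.slice(0, eq);
      const value = pair.slice(eq + 1);
      return name === "__utmv" && value.includes(label);
    });
}

// In the browser console, on your own site:
//   hasExcludeCookie(document.cookie, "exclude_visitor");

// Made-up cookie string for illustration:
const sample = "__utma=1.2.3.4; __utmv=12345678.exclude_visitor";
console.log(hasExcludeCookie(sample, "exclude_visitor")); // true
```

If this returns false on a machine that should be excluded, revisit the cookie page and check that the tracking code on it loaded.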

Resources

To streamline the process, I have created a basic html template file that can be used as the file that sets the cookie.  To use this file:

  1. Download it and open it in a web editor/text editor.
  2. Insert the site GA tracking code above the </body> tag.
  3. If necessary, review the instructions in the file.
  4. Upload to the server and set up the exclude filter in GA.