Google Analytics: Rewriting Page URLs to include ‘WWW’

Avoid disaggregating page data in Google Analytics by applying a filter that forces every page on your domain to be displayed in Content reports under the ‘www’ version (http://www.mysite.com/page.html as opposed to mysite.com/page.html).

Out of the box, Google Analytics displays URLs in Content reports without the domain name (‘page1.html’ instead of ‘http://www.mysite.com/page1.html’). This makes sense in most cases, since the domain is redundant and you probably know what it is. However, there are a variety of situations where you may want to display the full domain name, most commonly when your site is spread across multiple subdomains (http://www.mysite.com, store.mysite.com, blog.mysite.com, etc.).

It is easy enough to add a filter to your profile that will cause the full domain/subdomain to show up in your Content reports. (And, of course, if you want to track across multiple domains and subdomains, you’ll need to modify your GA tracking code to accommodate this.)

Potential Issue: Page Data Split Between Two Versions

All well and good, but an issue arises when the full URLs are displayed in Content reports on sites where visitors can reach the site at both ‘www.mysite.com’ and ‘mysite.com’. Because there are two versions of the domain, the same page may be reported on separately, once under the ‘www’ version and once under the ‘non-www’ version:

Page data split between www and non-www

In the example above, the same page is shown in two separate versions, one with 16 pageviews and one with 6 pageviews. Not cool.

For search engine optimization, this causes canonicalization issues and is best dealt with via a 301 redirect. However, this may not always be possible – at least in the near term, particularly if you don’t have access to your server settings – and you may want to have your data as accurate and relevant as possible NOW.

Solution

By applying an additional filter ahead of the filter that adds the domain to the URI, we can force Google Analytics to include the ‘www’ at the beginning of the domain in cases where it is not already present. The desired result:

Page data consolidated as 'www'

Here we can see that the data for ‘default.aspx’, previously split between two ‘pages’ in the report, is now consolidated to give us a more relevant picture of what is happening with visitors to the site: 22 pageviews of this page (16 + 6). Aaahh…that feels better!

Two Filters Used: one to add ‘www’, one to show full URL

This solution was reached by applying two straightforward filters to the GA profile:

1. A filter that recognizes cases where the hostname starts with the ‘raw’ domain without ‘www’ (‘fig4.com’ in this example) and adds ‘www’ at the beginning of the hostname in those cases. This filter will not add ‘www’ where it is already present, nor will it add ‘www’ where the page is on a subdomain. It does assume that you have a single domain, so it would have to be modified in the case of cross-domain traffic. It also assumes that you want to add ‘www’ to all URLs, as opposed to removing ‘www’ from all URLs. If you prefer no ‘www’, just flip the fields around.

Filter 1: Add the WWW

2. The usual filter for displaying the full URL including domain. I have included a leading ‘/’ in the Output To -> Constructor – this is not typically recommended in official documentation, but I did see it recommended somewhere by one of the big names in the field, so I figured I should try it and have seen no adverse effects.

Filter 2 - included domain
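
To make the logic of the two filters concrete, here is a minimal Python sketch of what they do to each hit, applied in order. It is not GA configuration and it is not a copy of the settings in the screenshots: the regex, the function names (add_www, full_url, report) and the sample hits are assumptions made for illustration, using the ‘fig4.com’ domain and the 16 + 6 pageviews from the example above.

```python
import re
from collections import Counter

RAW_DOMAIN = "fig4.com"  # example domain from this post

# Assumed pattern: the hostname is the bare domain and nothing else.
# The trailing '$' is an extra safeguard added for this sketch.
BARE_HOST = re.compile(r"^" + re.escape(RAW_DOMAIN) + r"$")


def add_www(hostname):
    """Filter 1 (sketch): prefix 'www.' only when the hostname is the bare domain."""
    if BARE_HOST.match(hostname):
        return "www." + hostname
    return hostname  # already 'www', or a subdomain: leave untouched


def full_url(hostname, request_uri):
    """Filter 2 (sketch): rebuild the page as '/' + hostname + original URI."""
    return "/" + hostname + request_uri


def report(hits, normalize=True):
    """Count pageviews per displayed page, with or without filter 1."""
    counts = Counter()
    for hostname, request_uri in hits:
        if normalize:
            hostname = add_www(hostname)  # filter 1 runs before filter 2
        counts[full_url(hostname, request_uri)] += 1
    return counts


# Hypothetical hits matching the numbers in the example above.
hits = [("www.fig4.com", "/default.aspx")] * 16 + [("fig4.com", "/default.aspx")] * 6

split = report(hits, normalize=False)  # filter 2 only: two rows
consolidated = report(hits)            # filter 1, then filter 2: one row
```

With these sample hits, split holds two rows (16 and 6 pageviews), while consolidated holds a single ‘/www.fig4.com/default.aspx’ row with 22. The order matters: the ‘www’ has to be added to the hostname before the hostname is copied into the page URI.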

That’s all there is to it. Works for me and I hope it will work for you, but let me know if you have any feedback.

This post goes out to my friends at www.CIGNA.com (or CIGNA.com if you prefer 🙂 ).


#Fail:

One approach that I had high hopes for didn’t pan out. Not sure why:

Filter attempt with search and replace that didn't work