You have old or duplicate content. How can you tell Google which is the authoritative version?
There’s a subtle problem many businesses experience with URLs that can have an adverse impact on their SEO. It goes like this: for whatever reason (and there are many), they have multiple URLs for the same content. For instance, they may have URLs for:
- example.com
- www.example.com
- www.example.com/home
- https://m.example.com
- https://example.com?ref=twitter
Again, these URLs all contain the same information, but as far as Google can see, they’re different URLs. It may be tempting to delete some of them or set up 301 redirects, but the way you handle them can have side effects you may not have anticipated. After all, some of these URLs are receiving traffic and links, and if you tamper with them too much, it could result in a broken link or lost analytics data.
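Fortunately, the search engines recognized the problem, and in 2009 Google, together with Yahoo and Microsoft, announced a solution: the canonical tag, written as rel=canonical. Essentially, the canonical tag tells Google which version you consider the most authoritative, and therefore which one it should prefer in its search results. Google treats this as a strong hint rather than a strict command. Let’s take a closer look.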
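As a rough illustration of the idea, here is a minimal Python sketch that maps the URL variants listed above to one preferred URL and prints the tag each variant would carry in its head section. The choice of https://www.example.com/ as the preferred version, the list of host aliases, and the rule that "ref" is a tracking parameter are all assumptions made for this example; your own site will have its own rules.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for this sketch only: which host is preferred, which hosts
# are aliases of it, and which query parameters are pure tracking noise.
PREFERRED_HOST = "www.example.com"
HOST_ALIASES = {"example.com", "www.example.com", "m.example.com"}
TRACKING_PARAMS = {"ref"}
HOME_PATHS = {"", "/home"}


def canonical_url(url: str) -> str:
    """Collapse a URL variant into the single preferred URL."""
    if "://" not in url:
        url = "https://" + url  # treat bare hosts like example.com as https
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = PREFERRED_HOST
    path = "/" if parts.path in HOME_PATHS else parts.path
    # Drop tracking parameters, keep everything else in its original order.
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit(("https", host, path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the link element that goes in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


variants = [
    "example.com",
    "www.example.com",
    "www.example.com/home",
    "https://m.example.com",
    "https://example.com?ref=twitter",
]
for variant in variants:
    print(variant, "->", canonical_link_tag(variant))
```

Every variant ends up carrying the same tag, which is the whole point: the pages stay reachable, but they all name one version as the authoritative one.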
Why not just use a 301 redirect?
There are many instances where using a 301 redirect is the correct choice. If you want a new page to completely replace an old page, so that anyone who tries to reach the old page is sent to the new one instead, then a 301 makes sense. A 301 redirect can also simplify the user experience. Links to your old URL may still be out there, but everyone who follows one lands on the new page, so there’s no risk of the old content continuing to be spread around.
However, a 301 redirect also makes the old content inaccessible, which may not be your intent. While it prevents links from breaking, it also takes the choice away from the user, who can no longer reach the original page. So, if you still want visitors to be able to access that content, rel=canonical is a better bet.
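To make the difference concrete, here is a small Python sketch of the two responses a server could send for an old page. It is not tied to any web framework; the Response type, the function names, and the URL https://www.example.com/new-post are hypothetical and exist only for this example.

```python
from typing import Dict, NamedTuple


class Response(NamedTuple):
    """A bare-bones stand-in for an HTTP response (hypothetical type)."""
    status: int
    headers: Dict[str, str]
    body: str


# Hypothetical address of the page that should win in search results.
NEW_URL = "https://www.example.com/new-post"


def redirect_old_page() -> Response:
    # 301: visitors and crawlers are both sent to NEW_URL,
    # so the old content is never shown again.
    return Response(301, {"Location": NEW_URL}, "")


def serve_old_page_with_canonical(old_body: str) -> Response:
    # 200: visitors still see the old content, while the tag in the
    # head names NEW_URL as the preferred version for search engines.
    html = (
        "<html><head>"
        f'<link rel="canonical" href="{NEW_URL}">'
        "</head>"
        f"<body>{old_body}</body></html>"
    )
    return Response(200, {"Content-Type": "text/html; charset=utf-8"}, html)


print(redirect_old_page().status)
print(serve_old_page_with_canonical("<p>Old post</p>").status)
```

The redirect returns an empty body, while the canonical version still delivers the old page to anyone who asks for it.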
Why wouldn’t I just delete the duplicate content?
First of all, in most cases the duplicate content exists because we created it intentionally. Maybe we wanted to use one URL for tracking purposes, or maybe we needed ways to categorize product listings. Canonical URLs are especially useful for ecommerce, where duplicate content is common, particularly if the store offers products in multiple countries. The products may all be the same, but with different currencies.
Furthermore, deleting duplicate content can lead to broken links. Maybe someone pinned your product to their Pinterest board, emailed it to a friend, or shared it on their blog. By deleting that content rather than canonicalizing it, you lose out on that traffic.
Can’t I just tell Google not to follow it?
Some people try to avoid duplicate content issues by adding a Disallow rule to their robots.txt file to keep Google from crawling the page. Unfortunately, this means Google can’t read the page at all, so in Google’s eyes its content might as well not exist. Google also never sees any canonical tag on that page. If the page generates traffic or links, Google can’t consolidate them with your main page and won’t give you any credit for them.
With a rel=canonical tag, any good SEO juice that goes to the other pages will count toward the overall ranking power of the canonical page. That’s a big win for everyone.
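For illustration, the following Python sketch uses the standard library's urllib.robotparser to show how a well-behaved crawler reads a Disallow rule. The robots.txt content and the /print/ path are made up for this example, and real crawlers may apply additional rules of their own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a folder of duplicate "print" pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

blocked = "https://www.example.com/print/widget"
allowed = "https://www.example.com/widget"

# A crawler that obeys robots.txt never downloads the blocked page,
# so it cannot see any canonical tag placed on it.
print(blocked, parser.can_fetch("Googlebot", blocked))
print(allowed, parser.can_fetch("Googlebot", allowed))
```

Because the blocked page is never fetched, nothing on it, including a canonical tag, can be read by the crawler.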
What about self-referential canonical URLs?
Here’s another piece of best practice from Google: set up self-referencing canonical tags, where each page’s canonical tag points to its own preferred URL. This is a simple insurance policy for underscoring to Google which page is, indeed, the most authoritative. Again, URLs can get messy, especially when you have plugins and various tracking mechanisms at play that could be auto-generating new URLs for special purposes. Help keep everything clear with canonical URLs.
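One way to audit this is sketched below: a short Python check, using only the standard library's html.parser, that pulls the canonical URL out of a page and compares it with the address the page was requested at. The class and function names and the sample page are hypothetical, and the comparison is a plain string match, which is an assumption that keeps the example short.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_self_referencing(page_url, html):
    # Plain string comparison: an assumption made to keep the sketch small.
    return find_canonical(html) == page_url


# Hypothetical page markup served at both the clean and the tracking URL.
page = (
    "<html><head>"
    '<link rel="canonical" href="https://www.example.com/widget">'
    "</head><body>Widget</body></html>"
)

print(is_self_referencing("https://www.example.com/widget", page))
print(is_self_referencing("https://www.example.com/widget?ref=twitter", page))
```

When the same markup is served at an auto-generated tracking URL, the tag still names the clean address, so the variant points back to the preferred page without any extra work.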
What about old content?
Another reason you may not want to use a 301 redirect is if you have old content you still want users to access. We actually faced this situation recently. After publishing a post about Google’s newer, longer meta descriptions, we had to publish a new post within a few months to explain why they were short again.
This left us in an awkward pickle: We didn’t want to entirely erase the old blog post, but we also didn’t want anyone to be confused about the current situation. And, while adding an update notice at the top of the old post helped direct users to our new content, we didn’t want that old post potentially competing with our new post for Google’s search traffic.
The canonical tag let us effectively point Google in the right direction. Instead of two posts on a very similar topic competing with each other, we now have a way to say which one is the most useful version for the user.
There is some debate among SEOs over whether or not this is good practice, and there are absolutely ways this can be abused. If you over-think and over-use canonical tags, there’s always the chance Google will stop trusting your use of them.
But ultimately, we felt this solution was most respectful of searcher intent. Now, someone who searches to find out why meta descriptions are short again will come to our new content rather than our old post about long meta descriptions. And if someone should still be googling for long meta descriptions, our canonical tag should guide them toward the more up-to-date version instead.
Canonical URLs establish the authority of one URL over another.
In short, while 301 redirects are something of a content override from a user perspective, canonicalization is a technical, Google-focused way of prioritizing content. Your users will barely notice, but it will help Google out a lot, and the results will have a bolstering effect on your SEO.