Improve SEO with Google’s New Canonical Element

Contributing Editor

No Duplicate Content

Avoid duplicate content. Every good Search Engine Optimization (SEO) expert will tell you that original content is the best way to succeed in your attempt to climb the slippery search engine slope. Copying content from elsewhere, or syndicating the same content to other sites, can have an adverse effect on your page rank.

Unfortunately, most websites inadvertently publish duplicate text. Consider your home page; is the same content available from different URLs?…

http://www.mysite.com/
http://www.mysite.com/index.html
http://www.mysite.com/index.php?sessionid=57

Few content management systems handle duplicate pages well. A URL rewrite engine, such as Apache’s mod_rewrite, can help, but it is difficult to guarantee a unique address for every page. Google and the other search engines do their best to address non-malicious duplication, but you can never be certain that your website has not been downgraded.
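As a rough sketch of the rewrite approach, a hypothetical Apache .htaccess fragment (it assumes mod_rewrite is enabled; the domain and file names are illustrative) could redirect the index-page variants shown above to the root URL:

```apache
# Illustrative sketch only -- assumes mod_rewrite is enabled.
RewriteEngine On

# If the browser requested /index.html or /index.php directly
# (with or without a query string such as ?sessionid=57)...
RewriteCond %{THE_REQUEST} \s/index\.(html|php)[?\s]

# ...issue a permanent (301) redirect to the root URL.
# The trailing "?" discards any query string.
RewriteRule ^index\.(html|php)$ http://www.mysite.com/? [R=301,L]
```

Even with rules like this in place, session IDs, tracking parameters, and alternate paths can slip through, which is why a page-level hint such as the canonical element is useful.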

The duplicate content problem has finally been resolved with Google’s new canonical element. Web developers can indicate their preferred page URL using a new <link> tag in the HTML <head>. For example:

<link rel="canonical" href="http://www.mysite.com/" />

Note:

  • The canonical URL must be on the same domain, although sub-domains such as www.mysite.com and products.mysite.com are permitted.
  • Relative paths are handled: Google will resolve them against any URL set by the page’s <base> element.
  • The preferred URL does not need to contain an exact replica of the original page content. Google will permit slight differences, such as the order of a list of products. However, it is certainly advisable to avoid that situation where possible.
  • Google will follow canonical chains, but recommends that a single valid URL is specified for the best results.
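To illustrate the second point above, a relative canonical URL is resolved against the page’s <base> href (the domain and file names here are purely illustrative):

```html
<head>
  <base href="http://www.mysite.com/products/" />
  <!-- Resolves to http://www.mysite.com/products/widgets.html -->
  <link rel="canonical" href="widgets.html" />
</head>
```

An absolute canonical URL removes any ambiguity, so it is the safer choice where your CMS allows it.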

Further information and instructions are available at the Google Webmaster Central Blog. The tag is being parsed by the GoogleBot now. Both Microsoft and Yahoo have agreed to the proposal and are likely to support it shortly.

Have you experienced page rank problems owing to duplicate content within your site? Will the canonical element assist with your SEO effort?


  • http://www.ewriting.pamil-visions.com/ Mihaela Lica

    I think you meant duplicate content can have an adverse effect on SERPs rather than page rank. Other than that the information is gold :)

  • http://www.arwebdesign.net samanime

    This is great news. I plan to implement this immediately. I’m so glad they did it in a standards-compliant manner as well. When I read “canonical element” I was worried, but I think they figured out a great solution.

  • http://www.brothercake.com/ brothercake

    So Google now wants us to take responsibility for their inability to tell the difference between natural duplication and spamming? They expect us to provide the manpower that they’re not prepared to?

    This reminds me of the brief hysteria that arose around content designed for screenreaders – content that is invisible to screen users using offleft positioning, but still there in the source so that screenreaders hear it. To some this was a great concern that google would view it as spamming. And my reply to that was the same as to this – that’s google’s problem, not mine. If google penalises me for doing this then google is broken; period.

    Any company that would expect us to take this seriously has way, way too much power.

  • http://www.optimalworks.net/ Craig Buckler

    You have a valid point. Google should not penalise natural duplication, especially if it’s within the same domain. To be fair to Google, they try and cater for that situation.

    The canonical element could certainly help when you publish two or more versions of the same article, e.g. a normal version, a text-only version, and a mobile version. Whether Google realise those pages are inexact replicas is another matter.

    Does Google have too much power? Probably. Is the search engine broken? Nothing is ever perfect. However, you and your clients need a good position in search results. It’s difficult to ignore a known indexing problem especially when a workaround is available.

  • http://www.tyssendesign.com.au Tyssen

    However, you and your clients need a good position in search results.

    Yes, I don’t think too many clients are going to support you taking a moral stand if no-one visits their site because they’ve been penalised.

  • Anonymous

    So Google now wants us to take responsibility for their inability to tell the difference between natural duplication and spamming? They expect us to provide the manpower that they’re not prepared to?

    That seems like a bit of an overreaction!
    Through accidents of history and changing practices, most pages on my website have multiple URLs. In some cases, it’s because there are links from outside that have www. on them (although no internal links do), in many cases it’s because Google uses the old .htm extension rather than the current canonical .shtml extension, and in some cases it is because the page has been renamed. The old links always send users to the right page, but it would be very useful to be able to set a definitive link on each page so that Google is referencing the correct URL – that will help users, by ensuring that SERPs are not duplicated, and it will help me when analysing my logs.

    It is also very handy for anyone whose site generates print-friendly or text-only pages to ensure that search engines don’t send users to these pages, which won’t be as helpful for them as the ‘proper’ version of the page.

    I welcome this move – it is very little effort for web designers, and if it improves the way the search engine understands your website, what’s the problem with that?

  • http://www.mikehealy.com.au cranial-bore

    This brings to mind tags and categories within WordPress (or any blog/CMS with those features I guess), where a tag may be identical to a category name. The same content will be delivered if requested by clicking on a link from a tag cloud, as by choosing a category. This could be a reasonable solution to direct one such page to the other.

    I just need someone else to write a WP Plugin to do this :)

  • http://www.optimalworks.net/ Craig Buckler

    Amazingly, plugins have already appeared for WordPress, Drupal and Magento. Grab them from Yoast.com.

  • http://www.brothercake.com/ brothercake

    Yeah maybe I overreacted a little, on further reading it seems fairly reasonable.

    I just get mighty suspicious at anything that forces one group of people to take responsibility for another group of people’s problems, and in essence that’s what we have here. It’s all very well for google to play the “let’s all pull together for the good of the internet” card, but we know perfectly well that google isn’t concerned with the good of the internet, it’s concerned with its own interests, which may or may not co-incide.

  • http://www.optimalworks.net/ Craig Buckler

    Google’s tag might be reasonable, but what about the latest idea from Microsoft for Internet Explorer 8.0?

  • shoebox

    So what’s better? A meta description rich url or one with a permanent id number in it?

  • http://www.jtresidder.com/ jtresidder

    @brothercake: It’s Microsoft’s fault that early IE doesn’t implement the box model correctly, etc. – should we make a stand against them too, or work around it to ensure that our clients’ sites render properly in both models?

    You’re right of course, but at the end of the day that’s irrelevant. The bottom line is that if you don’t avoid what Google considers to be duplicate content, your site will suffer.