Canonical Tag Announced: Google’s Matt Cutts Interviewed

"Cesspool" of duplicate content cleaned up by Google, Yahoo and Microsoft search.

Eric Schmidt once famously (or infamously, depending on how you look at it) called the Internet a cesspool. Now Google, along with search rivals Yahoo and Microsoft, is working to clean up that cesspool to some extent.

Much of this "cesspool" comes from duplicate content, and the three search engine giants have now jointly announced a tag that lets you tell them which URL format you prefer for a page. This is called the Canonical Tag and looks like this:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />

As Google explains (with an example and an FAQ), you can specify your preferred version of a URL by simply adding this tag inside the <head> section of the duplicate content URLs:

http://www.example.com/product.php?item=swedish-fish&category=gummy-candy
http://www.example.com/product.php?item=swedish-fish&trackingid=1234&sessionid=5678

and the search engines will understand that the duplicates all refer to the canonical URL: http://www.example.com/product.php?item=swedish-fish. Additional URL properties, like PageRank and related signals, are transferred as well.
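As a rough illustration of the example above, the following Python sketch derives the canonical URL from each duplicate and renders the tag to place in <head>. The names CONTENT_PARAMS, canonical_url and canonical_tag are hypothetical, and the assumption that only the "item" parameter changes the page content is specific to this example; a real site would need its own list.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this example: only "item" selects the content shown.
# Parameters such as category, trackingid and sessionid are treated as noise.
CONTENT_PARAMS = {"item"}


def canonical_url(url: str) -> str:
    """Return the URL with every non-content query parameter removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in CONTENT_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Render the <link> element that belongs inside the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


duplicates = [
    "http://www.example.com/product.php?item=swedish-fish&category=gummy-candy",
    "http://www.example.com/product.php?item=swedish-fish&trackingid=1234&sessionid=5678",
]

# Both duplicates produce the same tag, pointing at the preferred URL.
for url in duplicates:
    print(canonical_tag(url))
```

Keeping an explicit list of content-bearing parameters, rather than a list of parameters to strip, means that any new tracking parameter is dropped by default.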

If you have unanswered questions about the canonical tag, now is the time to ask. Google will be happy to answer them. Yahoo and Microsoft have additional posts about the tag.
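Before asking, it can help to confirm what a page actually declares. The sketch below uses Python's standard html.parser to pull the first rel="canonical" link out of a page's markup. The class and function names are hypothetical, the sample page is made up, and this is only a simple check, not a description of how any search engine reads the tag. Note that it does not verify that the link sits inside <head>.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html_text):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Made-up page used only to exercise the parser.
page = """<html><head>
<title>Swedish Fish</title>
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
</head><body>...</body></html>"""

print(find_canonical(page))
```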

  • http://www.seosean.com SEO Service

    Wow, now that’s some news. I had no idea they were working on something like that. This will be a big help to SEOs and webmasters alike.

  • http://www.techknowl.com techknowl

    Wow, great! Combined efforts from the search engine giants should bring better-quality search results. By the way, will there be any negative effects for sites that are not using the canonical tag?

  • http://www.cmdmall.com Clint Dixon

    Wow, Google. I have been showing clients the error you could not understand for over four years. I think it’s just easier now to do mod rewrite and skip involving Google…

  • http://www.snerdey.com/templatemonster.html Snerdey

    This is great news as so many websites contain the exact same thing. Every word, image, link farms just so much junk!

    The Cesspool Police are coming.. it’s the new Man!

    I bet this action will remove 1,000,000,000 results ;)

    LOL

  • http://www.stepladdersuk.com Sid Bourn

    This will be gr8 for us webmasters if it helps with our rankings.
    S

  • http://www.lifeiscolourful.com/ Abhijeet

    I understand the importance of Canonical tags, but I am finding it difficult to apply them to WordPress blogs automatically. Adding the tag manually works for static websites, but on WordPress blogs the same content appears on the post page, tag pages, category pages and elsewhere. That needs to be sorted out, but how?

    • http://www.gpschildtracker.net GPS Child Tracker

      The solution to this is to make sure you only add the appropriate sections in your widget areas.

      So, what I do for example is to include ‘recent posts’, ‘categories’, ‘pages’ in the widget area only. Don’t include anything else.

      Then, I would recommend you install the plugin http://wordpress.org/extend/plugins/all-in-one-seo-pack/ where you can choose to let google index your archives, categories and tags or not.

      Take a look at one of my sites for an example: http://www.tahitianpearlrings.net

      Hope that helps.

  • http://www.hargate-hall.co.uk Peak District self catering

    Presumably using this tag means that scraper sites will actually be working in our favour? In the past they could dilute PR, but using this tag we will be getting PR from them.

  • bj

    As others have pointed out, though this may be okay in theory, I think it’s a nightmare to implement on dynamic sites, and most sites are dynamic these days.

    There’s another issue here. We, the webmasters, are not the ones indexing the web. I think Google should come up with something that allows them to figure it out on their own instead of pushing the responsibility on the webmaster yet again. THEY are the indexers, not us.

  • http://www.hotel-france-hotels.com/ Hotels in France Ltd

    This tag seems a good idea. However, it also seems to still rely on webmasters. If you are a scraper, or an affiliate with 10000 pages of repeated content that is making you a living, are you really expecting them to use this tag?

    Having spent the last 5 years writing unique content for my sites about my products, I can’t see the attraction. If my sites were all just repeated information and they were making me a living, I still would not be able to see it.

    Still it seems a good idea. I just hope it gets used.

  • Bluegrass

    This tag only works within a domain so it won’t help in any way with scrapers and other webscum. It simply allows the webmaster to identify the preferred url within their site not across domains.

  • http://www.greenteethmm/com/ boggart

    As usual Google’s solution is more complicated than it needs to be and therefore less effective than it ought to be.

    Given the information Google can gather on each of us, it would be simpler to identify the first posting of a piece of content as the original and include a weighting in the algorithm to help lift it above duplicates.

    I am not involved in web business but can understand the frustrations of a content generator who sees their work being pirated. On the other hand content generators like myself who write for pleasure can have a following at several community sites so very often we duplicate our own content to reach more readers.

  • http://niche-traffic-sale.blogspot.com Ami

    Frankly I am puzzled by this post. I thought that Google was already dealing with the duplicate content thing by giving credit to the first version indexed.

    • Jim

      What if the scraper was indexed first? I don’t think they ever used the ‘first indexed’ system.

    • http://getforexhelp.com GetForexHelp

      Me too. I also thought that Google recognized where original content comes from, until I caught a page ranking above mine for a post I had published on my blog 3 days before this guy stole it. So I wrote him a clear message and tried to complain to Google themselves, but guess what: that is impossible to do.

      So I ended up on one of Google’s forums and placed my complaint there, only to be told to go and post it on a different forum! I clearly stated in my forum post how upset I was with Google’s inability to recognize where content comes from, and asked how it was possible to rank a thief’s page above mine when the content was clearly stolen and had been posted on my blog 3 days before the guy posted it.

      The thief took the post off his blog, and I don’t know if he did it himself or Google did, as his blog is a Blogger blog from Turkey. The blog is still there and only the page with my original content is off.
      Such grave mistakes in their search engine only show how imperfect their search results are, and only God knows how they will be able to fix these. Jeez, talk about copied content, stolen content and whatnot, while we have to keep searching for the thieves manually using phrases taken from our original content.

  • http://www.bestcellulitecreams.net Best Cellulite Creams

    I’ve always thought this duplicate content debate is a bit of a red herring and refer to Google’s own webmaster guidelines.

    It says under ‘Quality guidelines’:

    “Don’t create multiple pages, subdomains, or domains with substantially duplicate content.”

    Now, the key thing here is the word ‘substantially’. There is so much syndicated content on the web and many of these pages get indexed. As a test, find an unusual sentence in any article and search for it in Google. You should find many results returned, with the top one being the one Google ‘respects’ the most.

    To me, the use of the word ‘substantially’ does not mean you can’t use existing content. The point is, if you can add ‘value’ to that content to enhance the user’s experience, i.e. you provide more information, images, a video perhaps, a critique, etc., it is likely you could even end up outranking the original page itself.

    That clause in the guidelines does not mean you cannot use the work of others. It probably does mean you should not 100% duplicate it. But even Google does NOT say you can’t.

    Think about it.

    • http://www.TheSatinButton.com k4satin

      What you’re referring to is duplicate content on different domains, such as re-posting somebody’s article, press release or blog post. Your comments make sense for that context, but this tag is not about different domains.

      This is for URLs on the same domain that lead to the same content. For instance, if you put the same product under a couple of different categories on an ecommerce site, you may end up with different URLs, even though the content is identical. Some shopping carts seem to handle this better than others, but Google is giving us a tool to let them know that the different pages can be looked at as one. It would be nice to see the shopping cart software developers integrate this tag so that the problem is resolved automatically.

  • http://www.byfchat.com jay

    Personally I love this new tag :)

  • William

    I take it that half the commenters cannot read English.

    This tag is NOT to stop scrapers and other sites using syndicated, scraped, stolen, or otherwise duplicate content.

    It is simply to tell the engines which format of URL (when you have multiple URLs to the same thing, as is always the case with any rewritten URLs) you wish to be used per page in their results.

    It will not help at all with any site but your own.

    And I have to agree with others here that Google has once again shown they do not have the ability to filter their own results in a meaningful way, and are wanting webmasters to change their pages for the search engines yet again. What happened to that old gem in the google guidelines about creating pages not for search engines, but only users, hmmmm?

  • Brett Dusek

    This is ridiculous. As a web designer, I shouldn’t have to do anything to help Google with their search processes. What ever happened to designing for the user not the search engine?

    • Guest

      Yes but if you want traffic to your site, you have to play the game, and google is the game master!

  • http://www.seofox.com David Ogletree

    All this thing is for is to help your own site fix problems with duplicate content caused by poor programming and bad internal linking. Google is already doing this. They are a little tired of it and have given us a tag to help them out. It only affects your site, nobody else’s. If you don’t have the tag, Google will just keep doing what they always did. It is also a tag to let them know that an SEO is working on a site. It is better to fix your own website and not need this tag. The tag is useless if all your URLs bring up unique content.

  • http://www.bopped.com/ Fab Four

    For publishers who are mainly publishing static html pages of original content this new tag is not relevant but it is good to get the information here as it breaks – another addition to my knowledge for when it’s needed – thanks Webpronews.

    • http://ripsychotherapy.com Mike Adamowicz

      Thanks. I was wondering how this applied to my static URLs. Glad I don’t have to deal with this.

  • http://www.searchen.com John Colascione

    This seems like it will be good for developers of widely used software, like vBulletin, to integrate into their forum software… This will help with issues such as showthread being “duplicated” by “printthread”, as well as the forum archive, which has always been thought to create duplicate content problems…

  • http://www.maximise.ie SEO Ireland

    The new tag makes a lot of sense, I had wondered if they would come up with a solution to the problem of multiple URLs for one page.

  • http://www.21stsoft.com web development company

    This assists web developers in their constant code writing solutions to correct this problem. I don’t see it as a search engine problem, but one brought forth by the size of the animal (Internet).

    Needed this for awhile…

  • http://www.canimakebigmoneyonline.com/ George

    It would be nice if the search engines would try to fix their own problems and stop asking webmasters to fix bugs in their algorithms. I understand why they are doing this, but it seems like a really lame solution to me.

  • http://searchenginemarketingnews.blogspot.com bavajan.seoppc

    Can this canonical tag be applied to the following URLs too?

    www.example.com/index.php
    example.com/index.php
    example.com/

    Can the above URLs be pointed to www.example.com/ by placing the canonical tag, or not?

    I know this can be done using 301 redirects. What I want to know now is whether this simple canonical tag works for this as well.

  • http://inchoo.net Toni Anicic

    It really does solve the on-site duplicate content problem; however, we already had other ways to keep these pages out of the index (noindex, robots.txt and so on).

    The real revolution will be once Google finds a way to distinguish between original content on one domain name and duplicated on another. We still have no way to determine what article is original and what is stolen.

  • http://www.direito2.com.br Ruben Zevallos Jr.

    Hi,

    Can we add this also for pages that we collect from Internet? Like sobre references or other content?

  • http://www.latterkursus.dk Ejvind

    It sounded like a good idea for Google to help us index our pages with duplicate content by giving us a tag. But as many have already stated here, there are far more serious issues, like who put the content up there first and who is the original contributor.

    So thanks for taking a first step towards more originator friendly search results, as well as more user friendly search results.

    • http://www.diamondonnet.com Diamonds

      Yep, does not solve that more important issue.

  • http://www.qualityclotheslines.retractable.ntml Retractable clothesline

    Ha ha, somehow I think blondie was observing the yellow finch in the tree while Matt was talking. Her response was a cover-up, I JUST KNOW IT LOL.

  • http://www.lifeiscolourful.com/ Abhijeet

    Currently it’s an interesting topic to look at, but it would be more interesting if all the search engines can achieve what they wanted. Fingers crossed.

  • http://tinyurl.com/cq7ce5 john

    Hopefully this will solve a lot of duplicate content crawl errors and will help correct organic search results.

    • http://www.endai.com Endai Internet Marketing

      I see the organic search results suddenly tilting towards those mega sites with a huge catalogue of products. It’s already tough enough to out rank them with their sheer size in their favor.

      Now they can trim their unwieldy parameter filled strings into an optimized URL.

  • http://www.canonicaltag.com Bill

    This new tag is actually going to be very useful for a lot of sites. I know of a lot of ecommerce sites that offer products in different colors and those sites have essentially the same content but different pages for different colors: and the canonical tag will help.

  • http://www.hughzebeezlaughs.blogspot.com/ Hughze

    This hasn’t been a problem for me yet. It’s nice to know however.

  • http://www.sap.com/business_management_software/inventory_management.epx Inventory Management

    Has anyone tested this and seen results? We need to see how heavily this is being weighted before considering implementation.

  • http://adhd-npf.com/ Rakel

    very nice, now let’s see how it will solve our problem.

  • http://helpful-tips-for-seo.blogspot.com/ Peter Paul

    Canonical issue is not good for any website, and the information provided in this post on canonical issue is very helpful to us. Thanks
