Google Busts the Duplicate Content Myth
While Google’s Matt Cutts has certainly provided a wealth of helpful tips via the company’s Webmaster Central YouTube channel, he is not the only one to do so. Greg Grothaus of the Search Quality Team has posted a video (along with a presentation on the Webmaster Central Blog) covering duplicate content and multiple site issues that webmasters continue to face when trying to rank well in Google.
Greg begins by clearing up a popular myth: that Google penalizes sites for having duplicate content. This is not the case. That’s not to say that duplicate content can’t have a negative impact on your rankings, but Google itself is not penalizing you for it.
Greg says people see Google’s notice about omitted results and assume their content is being dropped from Google’s results altogether, when in fact it may only be omitted for that particular query. Greg stresses that duplicate content is simply a factor on a "by query" basis.
"What’s actually happening, is that we’re looking at the query that the user’s doing, and we’re saying that we want diversity in the results we’re going to show a user," says Grothaus. He says those who think their content is being omitted as duplicate will likely find that it shows up in results after all if they adjust their query to reflect the missing piece more specifically.
Google recognizes that most duplicate content is not created to be deceptive. There are of course exceptions, which are considered spam. Grothaus says even spam sites aren’t being penalized for having duplicate content, though. They’re being penalized for being spam. He compares it to bold tags: some spammers use them, but Google doesn’t penalize people just for using them, and it doesn’t penalize people just for having duplicate content.
Grothaus’s presentation lists examples of URLs that are different but show the same content. Google will recognize that they’re the same and will try to pick the right one (although sometimes it picks the wrong one). Greg says webmasters are in the best position to know which one is best, so it helps to use only one.
You will not be penalized for using more than one, but some issues can arise that may negatively affect your rankings. For one, your link popularity will be diluted. Backlinks pointing to several different URL versions of the same content make it harder to accumulate link juice for any one URL. Greg says that user-unfriendly URLs in search results may undermine branding efforts and decrease usability as well. Plus, with multiple versions of the same thing, Google will spend more time crawling the same content, leaving it less time to go deeper into your site, and you run the risk of some content not getting indexed.
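To make the dilution point concrete, here is a minimal Python sketch that groups backlinks by a single canonical form. The example.com URLs, the `canonicalize` helper, and the normalization rules (forcing www, dropping index.html, ignoring a session parameter) are all assumptions made for illustration. They are not taken from Grothaus's presentation, and they are not how Google decides which URLs are duplicates.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "ref"}


def canonicalize(url: str) -> str:
    """Map a URL variant to one preferred form (illustrative rules only).

    Simplifications: forces http and a www host, and drops any port.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in IGNORED_PARAMS]
    return urlunsplit(("http", host, path, urlencode(kept), ""))


# Four hypothetical backlinks that all point at the same page.
backlinks = [
    "http://example.com/",
    "http://www.example.com/",
    "http://www.example.com/index.html",
    "http://www.example.com/?sessionid=123",
]

per_variant = Counter(backlinks)                      # links split four ways
per_canonical = Counter(canonicalize(u) for u in backlinks)  # links pooled
```

Counted per variant, each URL has a single backlink. Counted per canonical form, the one preferred URL gets credit for all four.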
Fixing the Issues
To avoid such issues, Grothaus suggests using a "canonical" version of the URL, meaning the simplest, most significant form. He says to pick one for each page and link consistently within your site. You can also use the rel="canonical" link element as explained by Matt Cutts in the following clip:
Rules for rel="canonical"
There are rules to consider for the rel="canonical" link element. For one, it should only be used between pages on the same domain. It works across different hosts within that domain, so blog.webpronews.com could suggest www.webpronews.com as a canonical URL. It doesn’t work across domains, however, so www.webpronews.com couldn’t suggest www.smallbusinessnewz.com.
You can use the element across protocols, such as http:// vs. https://, and across ports. Pages don’t have to be identical, but they should be similar. Slight differences are OK. You don’t have to use the rel="canonical" link element. It is just another option, or "another tool in your arsenal," as Grothaus says.
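The rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and treating the last two labels of the hostname as "the domain" is a simplifying assumption (it would get suffixes such as .co.uk wrong, where a real implementation would consult the Public Suffix List). It says nothing about how Google itself evaluates the hint.

```python
from html import escape
from urllib.parse import urlsplit


def registrable_domain(url: str) -> str:
    """Naive guess at a URL's domain: the last two hostname labels.

    Assumption for illustration only; wrong for suffixes like .co.uk.
    """
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def canonical_hint_allowed(page_url: str, canonical_url: str) -> bool:
    """True if both URLs share a domain, per the rule described above.

    The host (subdomain), protocol, and port are allowed to differ.
    """
    domain = registrable_domain(page_url)
    return domain != "" and domain == registrable_domain(canonical_url)


def canonical_link_element(canonical_url: str) -> str:
    """Build the link element to place in the page's head section."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'
```

With these helpers, `canonical_hint_allowed("http://blog.webpronews.com/post", "http://www.webpronews.com/post")` is true, while the same check from www.webpronews.com to www.smallbusinessnewz.com is false.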
Another option is to make all non-canonical URLs do a permanent (301) redirect to the canonical (or preferred) URL. 301 redirects are also commonly used when moving sites. In addition, Google’s Webmaster Tools lets you specify whether you prefer the www or non-www version of your site.
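As a rough sketch of the redirect approach, the small WSGI application below sends a 301 from any non-preferred host to the preferred one. The host name www.example.com and the `app` function are hypothetical. In practice this kind of redirect is often configured in the web server itself rather than in application code.

```python
# Hypothetical preferred (canonical) host for the site.
CANONICAL_HOST = "www.example.com"


def app(environ, start_response):
    """WSGI app: 301-redirect requests for any other host to CANONICAL_HOST."""
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO", "/") or "/"
    query = environ.get("QUERY_STRING", "")
    scheme = environ.get("wsgi.url_scheme", "http")

    if host != CANONICAL_HOST:
        # Keep the path and query so every variant lands on its counterpart.
        location = f"{scheme}://{CANONICAL_HOST}{path}"
        if query:
            location += "?" + query
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"canonical page\n"]
```

For local experimentation, an app like this can be served with the standard library's `wsgiref.simple_server.make_server`.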
Lastly, Grothaus discusses multiple domains. This is in reference to when you have content for different audiences, such as by country, language, etc.
There are concerns here. You have to consider your reputation being distributed across multiple domains, and Google will only show what it perceives to be the best page for a particular query.
One factor that often goes overlooked is that with multiple domains, you’re potentially losing the advantage of Google’s tabbed user interface. You know how sometimes search results are expandable and point you to different links within the site? If your content is spread out across multiple domains, you may be missing extra clicks, because Google can’t link to another domain here.
Grothaus explains all of the above and elaborates on each point in the following fifteen-minute video. The information is based on his presentation from the recent Search Engine Strategies conference in San Jose.
See our own interview from SES with Grothaus here as well:
Did this information clear up any misconceptions you had about duplicate content? Let us know.