
PubCon: Getting Rid Of Duplicate Content

Tips from Google


The issue of duplicate content is something that all webmasters and site owners have to take into consideration and the PubCon session "Getting Rid of Duplicate Content Once and For All," addresses that challenge.

(Coverage of PubCon continues at WebProNews Videos.  Stay with WebProNews for continued coverage from the event this week.)


Ben D’Angelo, a software engineer at Google, spoke about duplicate content issues. There are several distinct scenarios, including multiple URLs pointing to the same page, sites in different countries sharing the same language, and content syndicated across other sites.

To avoid such issues, aim for one URL per piece of content. Users don’t like duplicated results, search engines save resources that can go toward indexing other content, and your own server is spared redundant requests.

Sources of duplicate content within your own site include multiple URLs pointing to the same page, www vs. non-www hostnames, session IDs, URL parameters, and printable versions of your pages.
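The www vs. non-www case can be collapsed to a single URL with a permanent redirect. A minimal sketch in Python, assuming a made-up canonical hostname (`www.example.com`) and a hypothetical helper that a server would call before handling a request:

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical canonical hostname

def canonical_redirect(url):
    """Return (301, target_url) if the request should be redirected
    to the canonical host, or None if it is already canonical."""
    parts = urlsplit(url)
    if parts.netloc == CANONICAL_HOST:
        return None
    # Send example.com (or any other alias) to www.example.com,
    # preserving path and query string.
    target = urlunsplit((parts.scheme, CANONICAL_HOST, parts.path,
                         parts.query, parts.fragment))
    return (301, target)
```

In a real deployment the same rule is usually expressed in the web server configuration rather than application code, but the effect is the same: every alias answers with a 301 to the one canonical URL.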

Google handles duplicate content in a number of ways. The general idea is to cluster duplicate pages and choose the best representative. Google uses different filters for different types of duplicate content; the goal is to serve one version of the content in the SERPs.

There are a variety of things you can do to prevent duplicate content. For exact duplicates, a 301 redirect is the best option. For near-duplicate content, use noindex or robots.txt.
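For example, printable versions of pages could be handled either way; the `/print/` path here is a hypothetical example, not something from the talk:

```
# robots.txt – keep crawlers out of the printable versions entirely
User-agent: *
Disallow: /print/
```

```
<!-- or, on the printable page itself: let it be crawled but not indexed -->
<meta name="robots" content="noindex">
```

The two approaches differ slightly: robots.txt blocks crawling altogether, while the noindex meta tag lets the page be fetched but keeps it out of the index.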

For country-specific domains, different languages are not duplicate content. Use unique content specific to each country, use different TLDs, and use Webmaster Tools for geotargeting.

For URL parameters, put data that does not affect the substance of the page in a cookie, not the URL.
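One way to read that advice: strip presentation-only parameters out of the URL and carry them in a cookie instead, so the same content always lives at one address. A sketch in Python, where the `sort` and `view` parameter names are invented for illustration:

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

# Hypothetical parameters that change presentation only, not substance
DISPLAY_PARAMS = {"sort", "view"}

def strip_display_params(url):
    """Split a URL into (clean_url, set_cookie_headers): parameters
    that only affect display move into cookies, so the content itself
    is served from a single canonical URL."""
    parts = urlsplit(url)
    cookie = SimpleCookie()
    kept = {}
    for key, values in parse_qs(parts.query).items():
        if key in DISPLAY_PARAMS:
            cookie[key] = values[0]   # becomes a Set-Cookie header
        else:
            kept[key] = values[0]     # stays in the URL
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path,
                        urlencode(kept), parts.fragment))
    headers = [morsel.OutputString() for morsel in cookie.values()]
    return clean, headers
```

A request for `/list?id=7&sort=price` would then be answered from `/list?id=7`, with the sort order remembered in a cookie rather than creating a second indexable URL.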

When it comes to other sites, include the original absolute URL in any syndicated content, and syndicate slightly different content where possible. Manage your expectations: if you use syndicated content, you will probably not outrank the original source.
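Making links absolute before syndicating can be automated. A minimal sketch in Python using a regular expression (a real pipeline would use an HTML parser; the function name and sample markup are invented for illustration):

```python
import re
from urllib.parse import urljoin

def absolutize_links(html, base_url):
    """Rewrite relative href/src attributes to absolute URLs so
    syndicated copies of the markup still point back at the
    original site."""
    def repl(match):
        attr, quote, link = match.groups()
        # urljoin leaves already-absolute links unchanged
        return f"{attr}={quote}{urljoin(base_url, link)}{quote}"
    return re.sub(r'\b(href|src)=(["\'])([^"\']+)\2', repl, html)
```

Run over a feed item before it goes out, `<a href="/article/42">` becomes `<a href="http://example.com/article/42">`, so the attribution link survives wherever the content is republished.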

Don’t be too concerned about scrapers or proxies; they generally won’t impact your rankings. If you are concerned, you can file a DMCA or spam report with Google.

If you need other information you can visit Google Webmaster Central or the Google Webmaster Central Blog.
 

  • http://www.malcolmcoles.co.uk/blog/wordpress-comment-pagination-and-duplicate-content/ duplicate content problems

    Comment pagination in the latest version of WordPress isn’t helping with duplicate content – multiple pages with the same post, just with different comments at the bottom.

    • http://www.nailfunguscures.org Nail Fungus Cures

      I’m sure by the time you have lots of comments, Google has already indexed your main page article anyway. But I know what you mean: if you configure WP incorrectly you can end up with the same content on posts, category pages, tag pages, and archives. I prefer to use the All in One SEO plugin to ensure that Google does not index anything other than posts or pages.

      Take a look at one of my sites for an example: http://www.thefapturborobot.com/review

  • Guest

    In regards to the duplicate content part of this blog post, I personally use the http://www.copygator.com website to find and stop duplicate content:

    1. It’s automated and brings me results instead of me searching for duplicated content. All I had to do was submit my feed, and it started monitoring it and showing me who has republished my articles on the web.

    2. I get notified by email, so it contacts me when it finds copies of my articles online.

    3. I use their image badge feature to alert me directly on my website when my content is being lifted.

    4. It’s a free service, as opposed to the per-page cost of Copyscape/Copysentry.