SMX Day 1: Duplicate Content Summit

Event organizer Danny Sullivan moderated and gave a brief introduction of the panelists: Vanessa Fox (Google), Amit Kumar (Yahoo! Search), Peter Linsley (Ask.com), and Eytan Seidman (Microsoft), as if the geeks in the room needed to be reminded.

First up was Eytan Seidman, Microsoft's Lead Program Manager, who stressed that duplicate content fragments your ranking. He also advised keeping session parameters simple. Duplicate content is okay for different locations as long as each version is unique. An important pointer was to prefer server-side 301 redirects over client-side redirects.

An attendee asked, "How do you avoid having people copy your content?"

Seidman: All my experience is based on sites I helped administer. One simple method is to tell people that if they use your content, they should attribute it to you. You can also block certain types of crawlers, detect user agents, and block unknown IP addresses from crawling.
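Seidman's "detect user agents, block unknown IPs" idea can be sketched in a few lines. This is a minimal illustration, not anything the panelists showed; the allowlisted agent strings and the blocked IP prefix below are assumptions (the prefix is a documentation-reserved range):

```python
# Hypothetical sketch of user-agent / IP filtering for scraper bots.
# The allowlist substrings and IP prefixes are illustrative only.

ALLOWED_AGENT_SUBSTRINGS = ("Googlebot", "Slurp", "msnbot", "Teoma")
BLOCKED_IP_PREFIXES = ("203.0.113.",)  # 203.0.113.0/24 is reserved for docs

def should_block(user_agent: str, ip: str) -> bool:
    """Block listed IP ranges, and any self-declared bot we don't recognize."""
    if any(ip.startswith(prefix) for prefix in BLOCKED_IP_PREFIXES):
        return True
    ua = user_agent.lower()
    looks_like_bot = "bot" in ua or "crawler" in ua or "spider" in ua
    known_crawler = any(s in user_agent for s in ALLOWED_AGENT_SUBSTRINGS)
    return looks_like_bot and not known_crawler
```

In practice real crawlers are verified by reverse DNS rather than a static list, since user agents are trivially spoofed.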

Microsoft handles duplicate content by aggressively searching for session parameters and tracking parameters at crawl time.

Peter Linsley, Ask.com's Senior Product Manager for Search, proposed using a copyright or even a Creative Commons notice to ward off duplicate content. Another of his pointers was to make content difficult to remold so that it maintains its uniqueness. If none of these work, take legal action.

Next to speak was Yahoo! Search's Senior Engineering Manager, Amit Kumar. Yahoo! extracts links while crawling sites but maintains a policy of not taking content from pages it knows are mere duplicates. However, Amit listed four reasons Yahoo! considers legitimate for duplication:

  1. Alternate document formats – PDF, printer-friendly pages
  2. Legitimate syndication – newspaper sites carrying wire-service stories
  3. Different languages
  4. Partial duplicate pages – navigation, common site elements, disclaimers

Finally, it was Google's very own Vanessa Fox who talked about duplicate content in the context of episodes of "Buffy the Vampire Slayer." In one episode there were two Xanders who had to be merged to end the problem. In another episode there were two Willows, but this time the problem was different: one Willow was good and the other was evil, so the evil one had to go.

The session ended with a long round of questions between the panel and attendees. Some of the most important pointers:

Peter: For the most part, a meta refresh is the same thing as a 301.
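For readers unfamiliar with the comparison Peter is making, a zero-delay meta refresh looks like this (URLs are illustrative; the server-side alternative is an HTTP 301 response):

```html
<!-- Client-side "meta refresh" with zero delay, placed in the page <head>.
     The equivalent server-side signal is a 301 redirect, e.g. in Apache:
     Redirect 301 /old-page https://example.com/new-page -->
<meta http-equiv="refresh" content="0; url=https://example.com/new-page">
```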

Vanessa: Use robots.txt to keep crawlers out of duplicate versions of your content
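As an illustration of that robots.txt pointer, a site could block crawling of a duplicate printer-friendly section; the paths here are hypothetical:

```
# Hypothetical robots.txt blocking printer-friendly duplicates
User-agent: *
Disallow: /print/
```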

Vanessa: Handle tracking URLs and parameters by always linking to the canonical version, which prevents dilution.
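That pointer amounts to normalizing URLs before you link to them. A minimal sketch in Python, assuming a site-specific list of tracking/session parameter names (the names below are made up for illustration):

```python
# Sketch: strip known tracking/session parameters so internal links all
# point at one canonical URL. Parameter names are assumptions.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PARAMS = {"sessionid", "sid", "ref", "source"}

def canonicalize(url: str) -> str:
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

print(canonicalize("http://example.com/page?id=7&sessionid=abc&ref=home"))
# -> http://example.com/page?id=7
```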

Though the search engine engineers were against the suggestion, most people attending were in favor of digital signatures to prevent content scraping.

Does someone have a picture of Matt McGee's face when Vanessa mentioned Buffy the Vampire Slayer? Another interesting footnote: only Microsoft's Eytan Seidman's PowerPoint worked well.



About Navneet Kaushal
Nav is the founder and CEO of PageTraffic, a premier search engine company known for its assured SEO service, web design and development, copywriting and full time SEO professionals.

Navneet has wide experience in natural search engine optimization, internet marketing and PPC campaigns. He is a prolific writer whose articles can be found in the "Best Articles" sections of many websites and article banks. As a search engine analyst, he has over 9 years of experience, and that knowledge is applied here. WebProNews Writer