SEO Shows Google Results Can Be Hijacked

    November 16, 2012
    Chris Crum

People have been claiming for ages that scrapers of their content show up in Google search results above their own original content. One SEO has now pretty much proven that if you don’t take precautions, it may not be hard for someone to hijack your search result by copying your content.

Have you ever had your search results hijacked? Scrapers ranking over your own original content? Let us know in the comments.

Dan Petrovic from Dejan SEO recently ran some interesting experiments, “hijacking” search results in Google with pages he copied from original sources (with the consent of the original sources). Last week, he posted an article about his findings, and shared four case studies, which included examples from MarketBizz, Dumb SEO Questions, ShopSafe and SEOmoz CEO Rand Fishkin’s blog. He shared some more thoughts about the whole thing with WebProNews.

First, a little more background on his experiments. “Google’s algorithm prevents duplicate content displaying in search results and everything is fine until you find yourself on the wrong end of the duplication scale,” Petrovic wrote in the intro to his article. “From time to time a larger, more authoritative site will overtake smaller websites’ position in the rankings for their own content.”

“When there are two identical documents on the web, Google will pick the one with higher PageRank and use it in results,” he added. “It will also forward any links from any perceived ’duplicate’ towards the selected ‘main’ document.”
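Petrovic’s two-sentence description of the duplicate rule can be sketched as a toy model. This is purely illustrative; the function name, fields, URLs, and numbers below are invented for the example and are not Google’s actual algorithm:

```python
# Toy model of the behaviour Petrovic describes: among identical documents,
# the copy with the highest PageRank is shown in results, and link equity
# pointing at the "duplicates" is credited to that winner.

def select_canonical(duplicates):
    """Pick the highest-PageRank copy from a set of identical documents."""
    return max(duplicates, key=lambda doc: doc["pagerank"])

docs = [
    {"url": "http://original-site.example/article", "pagerank": 3.1},
    {"url": "http://copy.example/article", "pagerank": 5.4},  # stronger copy
]

winner = select_canonical(docs)
print(winner["url"])  # the higher-PageRank copy "wins" the result
```

The point of the experiments below is that nothing in this selection favours the original author: whichever copy accumulates more PageRank wins.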

In the MarketBizz case, he set up a subdomain on his own site and created a single page by copying the original HTML and images of the content he intended to hijack. The new page was +1’d and linked to from his blog. Thanks to its higher PageRank, and after a few days for Google to index it, the new page replaced the original in the search results.

In the Dumb SEO Questions case, he tested whether authorship helped against a result being hijacked. Again, he copied the content and replicated it on a subdomain, but without copying any media. The next day, the original page was replaced with the new page in Google, with the original being deemed a duplicate. “This suggests that authorship did very little or nothing to stop this from happening,” wrote Petrovic.

In the ShopSafe case, he created a subdomain and replicated a page, but this time the original page contained a rel=”canonical” tag, which was stripped from the copy. The new page overtook the original in search, but it didn’t replace it when he used the info: command. The +1’s were removed after the hijack to see if the original page would be restored, and several days later, the original overtook the copy, Petrovic explained.

Finally, in the Rand Fishkin case, he set up a page in similar fashion, but this time “with a few minor edits (rel/prev, authorship, canonical)”. Petrovic managed to hijack a search result for Rand’s name and for one of his articles, but only in Australian searches. This experiment did not completely replace the original URL in Google’s index.

Rand Fishkin results

If you haven’t read Petrovic’s article, it would make sense to do so before reading further. The subject came up again this week at Search Engine Land.

“Google is giving exactly the right amount of weight to PageRank,” Petrovic tells WebProNews. “I feel they have a well-balanced algorithm with plenty of signals to utilise where appropriate. Naturally like with anything Google tries to be sparing of computing time and resources as well as storage so we sometimes see limitations. I assure you, they are not due to lack of ingenuity within Google’s research and engineering team. It’s more to do with resource management and implementation – practical issues.”

The Dumb SEO Questions example was interesting, particularly in light of recent domain-related algorithm changes Google has made public. In his findings, Petrovic had noted that a search for the exact match brand “Dumb SEO Questions” brought the correct results and not the newly created subdomain. He noted that this “potentially reveals domain/query match layer of Google’s algorithm in action.”

Petrovic believes there is still significant value in having an exact match domain. “Exact match domains were always a good idea when it comes to brands. It’s still a strong signal when it’s a natural situation, and is now more valuable than ever since Google has swept up much of the EMD spam,” he says.

Here’s what industry analyst Todd Malicoat had to say on the subject in a recent interview.

Regarding the Fishkin experiment, Petrovic tells us, “Google’s perception of celebrity status or authority is just a layer in the algorithm cake. This means that if there is a strong enough reason, Google will present an alternative version of a page to its users. There goes the idea that Wikipedia is hardcoded and shows for everything.”

When asked if freshness played a role in his experiments, he says, “Yes. Freshness was a useful element in my experiments, but not the key factor in the ‘overtake’ – it’s still the links or should I say ‘PageRank’. I know this surprised a lot of people who were downplaying PageRank for years and making it lame to talk about it in public.”

“This article was me saying ‘stop being ignorant,'” he says. “PageRank was and is a signal, why would you as an SEO professional ignore anything Google gives you for free? The funniest thing is that people abandon PageRank as a ridiculous metric and then go use MozRank or ACRank as an alternative, not realising that the two do pretty much the same thing, yet [are] inferior in comparison.”

“To be fair, both are catching up with real PageRank, especially with Majestic’s ‘Flow Metrics’ and the growing size of SEOMoz’s index,” he adds.

Petrovic had some advice for defending against potential hijackers: use rel=”canonical” on your pages, use authorship markup, use full (absolute) URLs for internal links, and monitor your content with services like Copyscape or Google Alerts, then act quickly and request removals when you find a copy.
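The first three of those defences come down to a few lines of markup. Here is a minimal sketch of what a protected page might contain; the URLs and profile ID are placeholders, and the rel=”author” link reflects how authorship markup worked at the time:

```html
<!-- Page served at http://www.example.com/my-article -->
<head>
  <!-- Self-referencing canonical: declares this URL as the original -->
  <link rel="canonical" href="http://www.example.com/my-article">
  <!-- Authorship markup (2012-era): ties the page to a Google+ profile -->
  <link rel="author" href="https://plus.google.com/112233445566778899000">
</head>
<body>
  <!-- Absolute URLs for internal links: if the content is scraped
       wholesale, these links still point back to the original site -->
  <a href="http://www.example.com/another-article">Related article</a>
</body>
```

As the ShopSafe test showed, a scraper can simply strip these tags from the copy, which is why the monitoring and removal-request steps remain essential.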

He also wrote a follow-up to the article, in which he talks more about “the peculiar way” Google Webmaster Tools handles document canonicalization.

So far, Google hasn’t weighed in on Petrovic’s findings.

What are your thoughts about Petrovic’s findings? Share them in the comments.

  • Mike

    Are people using practices like this? I work with vehicle registrations and when I search for cheap car insurance on google I get websites for hotels and plastic surgery. Also, I never see any of the big guys like Geico or Esurance. I don’t fully understand how hijacking works but I am trying to build a little website for my clients to visit. I know I am no big fish, but it’s discouraging to think I don’t stand a chance

  • http://seoenquirer.com SEOEnquirer

    He followed up with this article showing how you can get your competitor’s full backlink profile from Webmaster Tools just by stealing their content: http://dejanseo.com.au/mind-blowing-hack/

    • http://www.webpronews.com/author/chris-crum Chris Crum

      This article has been expanded since it was originally published, and Petrovic’s follow-up article is now referenced.

  • http://www.techmero.com Krishna Parmar

    This is something that is really annoying, I have also seen this same case in my niche also. Well! Over optimizing the Google Algo can also harm the bloggers and online business.

  • http://www.multitopic.net Krishna Parmar

    This is not very good according to me, Google Algo should have strong range of robots to solve such un happening issues.

  • http://www.acleaningbid.com Thomas Anthony

    Thanks for the input and I agree. I would like to add that my hosting company contacted me explaining that I “was going to break the Internet” (hahaha) because I was using complete urls to load data or images using php. They explained how I was slowing down their servers by doing that. I reconfigured the code with limited urls and now my pages load faster and I guess the Internet won’t be broken anymore ( again hahaha).
    Thanks again,
    Thomas Anthony

  • http://www.karate-london.co.uk Rod

    Our twenty or so ‘keywords’ were C&P by another rival group; even our spelling mistakes were evident when looking at their list of keywords which was copied in the same order that we had them.

  • http://VerticalMarketing.Co?Miami-Web-Design Web Master

    Yes, Google results can be hijacked, but on the few occasions I have been involved with, it was actually a Joomla or WordPress website that got hijacked via a JS redirect script added as a meta tag item. The script looks like this [ if document.location.href = ‘http://yahoo.com’ then domain.com ] – Here is a video of it on my FB page – Google Results Can Be Hijacked

  • http://www.enviroequipment.com Enviro Equipment Inc.

    So much for PageRank being unimportant. I only hope that Google finds out about this Dejan SEO study and makes appropriate changes to its algorithm to prevent this hijacking going on. I won’t hold my breath, though.

  • http://davidbillings.com Dave Billings – Air America Web

    Have you ever had your search results hijacked? Scrapers ranking over your own original content? “YES”

    This happened to one of my clients. Florida Patio Furniture Inc.
    http://www.casualtone.com The site was running 60,000 impressions and 2,000 clicks for the keyword “patio furniture”… displayed locally on pages 1-2 and nationally on pages 4-5… The site listing was bumped off on the 26th of Sept. 2012… Google filtered the keywords “furniture” and “manufacture”, which were the primary site keyword combinations. Found spammy links using primary front-page content in Copyscape, and had numerous reports from other SEOs about the incoming spam links in the Google forum. Site stats dropped to fewer than 700 impressions and fewer than 10 clicks!
    Has been a nightmare. Figured a competitor hired a black-hat SEO to scrape the site. The only fix was to re-structure the entire site indexing, improve quality and kill off heavyweight organic SEO. Still in the recovery process. Google’s response was that no manual actions were taken! Sweet…

  • http://www.tipsinablog.com Danny

    Isn’t giving out such details just giving food for thought?

    Currently there seems to be a lot of this stuff going on, and comment spammers(and hackers) are working overtime….and many are actually subscribing to sites(newsletter, etc…….)…

    Also, I heard recently (wasn’t it Google?) that PR was no longer a BIG SE ranking factor, and its relevance has been watered down to blend in with the other 199 Google ranking factors!

  • Alan

    Sort of makes a nonsense of Google’s raison d’être.

  • http://ebook-site.com Bryan Quinn

    Good article. Content from several of my websites has been copied and I often wondered about the search engine fluctuations.

    To be honest, I use Bing now whenever possible because I believe that Google is more interested in maximizing their shareholders wealth than providing the search results that people actually want.

  • http://www.xponex.com John Beagle

    Essentially nothing has changed except it’s easier now.

    My sites have been scraped for as long as I’ve had them. And yes, the scraped content often won in Google search results.

  • http://infonline.gr Chris

    I have faced a major problem with a travel-related website. Almost 2 years ago, when it was a newly established website, it literally overnight lost rankings because of a content hijack by a competitor. The result was that we lost the whole travel season, which almost destroyed our efforts and the money we spent…

  • http://Mabuzi.com kevin

    Will there be a point where the algo gets so complicated they forget what it’s for, or you get zero search results?

    Some people have no morals.

  • http://www.qylovia.com/ qylovia

    I am confuse, cause newby

  • http://www.palaceofpooch.com Palace Of Pooch

    Very interesting article. Thank you

  • Casey

    “Google’s algorithm prevents duplicate content displaying in search results and everything is fine until you find yourself on the wrong end of the duplication scale.” So I would suggest using a duplicate content checker (like http://www.plagspotter.com/) to find out if you have been copied.

  • Steve

    Google has an ethical obligation to ignore scraped content as part of the DMCA (Digital Millennium Copyright Act). If you create original content but don’t know SEO practices, Google should be able to recognize where the content first appeared, and then safeguard the owner of the protected content against copycats, regardless of the author’s knowledge of SEO practices.

    When they created spam laws “to protect buyers”, we should not be able to get around that law by creating “web site spam”. It is this exact attitude by web operators that makes me hesitate to even publish. Why should I bother, if the people I expect to reward my work will simply allow somebody to copy it, and if they can market it better than me, I am the one to lose, because Google can’t stand the thought of losing money today?

    Instead of rewarding the copycats, can’t Google just send an email to webmaster@my-domain.com: “we estimate you are losing $xxx.xx a day because you are not doing this x, y, z”? A few days later, Google is making more money and they get the ethical win-win.

    The only people that lose money by keeping SEO practices secret are Google (or anyone else for that matter), unless they want to pick who the winners and losers are going to be. What other answer is there?

  • http://www.bestname.org Martin

    Shocked!! Thanks for enlightening us!