Content Scrapers Not a Big Deal?

Google Says It's Possible to Benefit From Them


If you write for the web, whether on a blog or any other content site, there is a good chance your content has been scraped at some point, if not continuously. The good news is that it’s probably not that big of a deal. At least that is what Google’s Matt Cutts implies.

Answering user questions as he so often does, Cutts took on the question, "Is there a way to benefit from content scraped from your site?"

The simple answer to this is yes. You actually may be able to slightly benefit from having your content scraped. According to Cutts, if you make sure the pages on your site have links to you in them, the scrapers may leave the links in and end up linking to you. He says these links can "help you along."
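As a rough illustration of the idea, here is a minimal Python sketch that appends a link back to the original post to a piece of HTML before it is published. The function name `add_attribution`, the footer wording, and the example URL are all assumptions made for this example. Nothing here comes from Cutts or from any particular blogging platform. The link uses an absolute URL so that it still points to the original site if the HTML is copied elsewhere.

```python
from html import escape


def add_attribution(post_html: str, title: str, url: str) -> str:
    """Return post_html with a footer that links back to the original post.

    Hypothetical helper for illustration only. The URL should be absolute
    so the link still resolves to the original site when copied elsewhere.
    """
    footer = (
        "<p>Originally published at "
        f'<a href="{escape(url, quote=True)}">{escape(title)}</a>.</p>'
    )
    return post_html + "\n" + footer


# Example usage with a made-up post and URL.
print(add_attribution("<p>Post body.</p>", "My Post", "https://example.com/my-post"))
```

Whether a scraper keeps such a footer is up to the scraper, so this only helps in the cases Cutts describes, where the links are left in.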

"There are some people who really hate scrapers and try to crack down on them and try to get every single one deleted or kicked off their web host," says Cutts. "I tend to be the sort of person who doesn’t really worry about it, because the vast, vast, vast majority of the time, it’s going to be you that comes up, not the scraper. If the guy is scraping and scrapes the content that has a link to you, he’s linking to you, so worst case, it won’t hurt, but in some weird cases, it might actually help a little bit."

It’s the same principle Cutts described when discussing links in low-quality directories. He says Google tries not to score low-quality directories too highly, but being listed in them doesn’t hurt your site at all.

He says that most of the time you don’t really need to worry about scrapers, because they rarely have much actual impact on users. He does add that if you see a scraper ranking higher than you, you can consider filing a Digital Millennium Copyright Act (DMCA) request, or if it’s a true spammer (gibberish, etc.), you can go ahead and file a spam report on them.

  • http://2008taxes.org Steve

    I started posting everything to my blogs with links back to other articles on the blog and then posting a nice affiliate ad at the bottom for all the scrapers to paste on their site for me.

  • http://video.google.com/videoplay?docid=-6902379270204769711 West Coast Vinyl

    Scraping and getting a link back to its original web page is just good courtesy! Used for good to promote your window replacement industry, everyone benefits, including you.

    • http://www.3ac.co.uk Gary Taylor

      The benefit of getting a link back in acknowledgment just means that the scraper actually has a conscience about ripping off your work!

  • http://www.lexolutionit.com Maneet Puri

    What if the scraper is a smart one?

    I often have websites steal content from my website and post it on theirs as their own. What I do is contact them personally and ask them to remove it. Every time it has happened, it has been a peaceful affair.

  • http://texxsmith.com texxs

    Matt Cutts is full of it! Smaller sites, new sites, and sites with low PageRank are easily beaten in the SE results by scrapers. I’ve seen it happen many times to my smaller clients.

    Of course he wouldn’t worry about it! I wonder if he’s ever been part of any small business? I wonder if he even knows anyone with a website that has a PR of 1 or less . . .

    Google sucks! Fight Google!

  • http://www.seosean.com/blog SEOsean Blog

    I know of another website that directly scrapes all my blog posts, but they at least include a link back to my site at the end of each post and keep all the links in my original post. I was debating whether or not to go after them and get my content off their site, but after hearing this from Matt I’ve decided to just let it go. I mean, I am getting links back from this scraper and my site always shows up before their site in search results, so what the hay… I’ll just leave them be, thanks for the links!

  • http://www.green-trust.org Steve Spence

    I was getting scraped by an outfit in Asia, and they were not linking back or attributing where the content came from. I found a “copyright” plugin for WordPress that embedded a copyright notice and link back to me that they have not figured out how to disable. All is well, I have new incoming links.

    • http://blog.maskil.info/ Maskil

      Steve, please would you share the name/URL for the WP copyright plugin you mentioned?

      • http://blog.maskil.info/ Maskil

        For anyone who might be interested, Steve makes use of the following WP plugin:

        Simple Feed Copyright: WordPress Plugin | Quick Online Tips
        http://www.quickonlinetips.com/archives/simple-feed-copyright-wordpress-plugin/

  • http://www.kalejia.com/ kalejia

    Matt Cutts is full of it! Smaller sites, new sites, and sites with low PageRank are easily beaten in the SE results by scrapers. I’ve seen it happen many times to my smaller clients.

    Of course he wouldn’t worry about it! I wonder if he’s ever been part of any small business? I wonder if he even knows anyone with a website that has a PR of 1 or less . . .

    Google sucks! Fight Google!
    :L

  • http://wsipromarketing.com Internet Marketing Consultant MA

    Yes, this happens to me frequently but I also borrow content from others. The key is, borrow content, give credit back to the source via a backlink, and add to the conversation. Republished content can actually help spread the message. And that’s a good thing as long as it’s above board.

  • http://www.sitebyjames.com James

    Oh Sure… Let’s jump back to 1998 and scrape Google…

    http://web.archive.org/web/*/http://www.google.com

    Hey Google! Can we scrape you now?

    Hmmmnnn… it’s pretty easy to say it doesn’t matter when you’re a multi-billion-dollar empire…

    I know… I’m going to build rel=nofollow scrapers and see who I can benefit…

    • http://www.sitebyjames.com James

      In any event… he is probably referring to the regionalism of the low quality directories… having a wide footprint never hurt anybody…

      Hey Google! Have you finished eating up the entire web? I am glad you have finally come clean regarding scraping. It is after all how you came up with your content to support your ad network…

      How’s that going anyways?

      hey let me be the first to thank your security team for going public with that kernel vulnerability…

  • http://investigativeresources.co.uk Marilyn Marion

    “Is there a way to benefit from content scraped from your site?”

    The simple answer, Cutts says to this is yes. You actually may be able to slightly benefit from having your content scraped. According to Cutts, if you make sure the pages on your site have links to you in them, the scrapers may leave the links in and end up linking to you. He says these links can “help you along.” END QUOTE – LOL!!

    I fail to see where this guy has a Clue and I totally agree with 99% of what the rest of you have said above herein so there is no point in my re-writing it. Scraping is NOT back-linking people.

    But when did STEALING start being called *scraping*? WOW.
    Is that like a Murderer being now called an *unfit citizen*?

    What scraping??? It is STEALING and if your own links are left intact – it is a blunder on the thief’s part or they are just plain stupid! I am SO sick of it – I have had content I spent Hours creating only to have it stolen and have the Thief get better SE rankings!

    And a further rant here: I stopped using Google as an SE months ago. It is so full of DAS that you generally have to go to the 3rd page to begin getting any *real* results, and crappy results at that. Google, you have gone Way Down Hill.

  • http://www.agayomato.com Free acne info

    Scraping other people’s webpages and posting them as their own, without due recognition to the original owner, is just like “broad daylight robbery”. It’s not a healthy outlook. It looks like the internet, which is an important “tool” for the world community, has lately created lots of plagiarism.

    • http://www.sitebyjames.com James Weisbrod

      Somebody should release a clone copy of the Google search interface as a part of the Joomla and WordPress source distribution and have it directly interface and scrape google results.

      They should add in some advertising functions and then every Joomla and WordPress owner can have bidding wars for that number one paid advertisement spot.

  • http://www.ibizdaily.com/ Anthony

    Cutts is obviously a smart guy – but there is a disconnect between him (an employee) and most of us out here reading the stuff and listening to him (entrepreneurs and small business people). He only has to go out and do his job, and theorize about how it affects us – while we sit here, day after day, working and creating, figuring things out, trying to get ahead, etc. and making our own paychecks. Then he comes along and says it’s no big deal. It’s no big deal when a billion dollar company is cutting your check. It is a big deal when some idiot steals something you have created.
    I mean he is probably great at what he does, but he dispenses advice too easily on some things where he just doesn’t get what a big deal they are to some of the people he is addressing.

    • http://www.sitebyjames.com James

      That’s an excellent point… Matt is most likely a great guy… It’s just Google in general… They appear to live in a Utopian Environment. It’s sort of like Ivana Trump returning from visiting a third world slum and suggesting to the rest of the world that we consume less…

  • Thomas

    Here’s a real-life example: We post a press release on PRWeb and a corresponding page on our site. A small number of legitimate resources will use our content just as we wanted them to – to talk about what we’re doing, give us free additional press, and yes, some cross linking mojo that hopefully Google likes. However, compared to those few good pages, I’m finding – on average – nearly 600 websites that scrape our content (exploiting a legally gray void left by the PRWeb terms, by the way) and use it to promote their own blackhat click networks or domain sales.

    That’s 600 bad to 1 (or a few) good.

    How does Google *not* see this or consider it a bad thing for their audiences, when there are over 600 other links competing with my original content on their SERPs? Or does Google just ignore PR related content completely? If so, then what’s the value in cross-linking, if only blackhats get the benefit while the rest of us legit webmasters are ignored?

    Anyway, I sometimes wonder about the intelligence behind the algorithm… As someone above mentioned, it’s pretty easy to theorize about this inside the GooglePlex… but how much effort do they actually put into understanding the actual experiences of site owners who are, in fact, trying to do everything Google wants by giving them good quality content?

    Just my $0.02