
Webmasters: Googlebot Caught in Spider Trap, Ignoring Robots.txt


Sometimes webmasters set up a spider trap or crawler trap to catch spambots or other crawlers that waste their bandwidth. If some webmasters are right, Googlebot (Google’s crawler) seems to be having some issues here.
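For context, a trap of this kind usually pairs a robots.txt Disallow rule with a hidden link into the blocked directory: a well-behaved crawler checks robots.txt first and never follows the link, so only rule-ignoring bots fall in. A minimal sketch of that check, using Python's standard urllib.robotparser (the `/trap/` directory name is an illustrative placeholder, not from the forum thread):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks every crawler from the trap directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler consults the rules before fetching any URL.
print(parser.can_fetch("Googlebot", "http://example.com/trap/bait.html"))  # False
print(parser.can_fetch("Googlebot", "http://example.com/index.html"))      # True
```

Any crawler that requests `/trap/bait.html` anyway has, by definition, ignored robots.txt, which is what the webmasters in the thread say Googlebot did.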

In the WebmasterWorld forum, member Starchild started a thread by saying, “I saw today that Googlebot got caught in a spider trap that it shouldn’t have as that dir is blocked via robots.txt. I know of at least one other person recently who this has also happened to. Why is GB ignoring robots?”

Another member suggested that Starchild was mistaken, noting that similar claims had been made in the past, only for other issues to turn out to be at play.

Starchild responded, however, that it had been in place for “many months” with no changes. “Then I got a notification it was blocked (via the spidertrap notifier). Sure enough, it was. Upon double checking, Google webmaster tools reported a 403 forbidden error. IP was google. I whitelisted it, and Google webmaster tools then gave a success.”

Another member, nippi, said they were also hit four months after setting up a spider trap, which had been “working fine” until now.

“The link to the spider trap is rel=Nofollowed, the folder is banned in robot.txt. The spider trap works by banning by ip address, not user agent so its not caused by a faker – and of course robots.txt was setup up correctly and prior, it was in place days before the spider trap was turned on, and it’s run with no problems for months,” nippi added. “My logs show, it was the real google, from a real google ip address that ignored my robots.txt, ignored rel-nofollow and basically killed my site.”
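The mechanism nippi describes — banning by IP address the moment any client requests the disallowed directory — can be sketched roughly as follows (the `handle_request` function, the `/trap/` path, and the sample IPs are illustrative assumptions, not nippi's actual code):

```python
# IPs that have tripped the trap; a real setup would persist this
# (e.g. a firewall deny list) rather than keep it in memory.
banned_ips = set()

def handle_request(ip: str, path: str) -> int:
    """Return an HTTP status code, banning trap visitors by IP."""
    if ip in banned_ips:
        return 403  # previously trapped: the whole site is now forbidden
    if path.startswith("/trap/"):
        # Only a client ignoring robots.txt ever requests this path.
        banned_ips.add(ip)
        return 403
    return 200

# A crawler that ignores robots.txt hits the trap once...
assert handle_request("198.51.100.7", "/trap/bait.html") == 403
# ...and is then locked out of every page, matching the sitewide
# 403 Forbidden that Starchild saw reported in Webmaster Tools.
assert handle_request("198.51.100.7", "/index.html") == 403
# Other visitors are unaffected.
assert handle_request("203.0.113.9", "/index.html") == 200
```

Because the ban keys on IP rather than user agent, a spoofed "Googlebot" user-agent string cannot trigger it from elsewhere — which is why nippi argues the trapped crawler was the real Googlebot.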

We’ve reached out to Google for comment, and will update if and when we receive a response.

Meanwhile, Barry Schwartz is reporting that one site lost 60% of its traffic instantly, due to a bug in Google’s algorithm. He points to a Google Webmaster Help forum thread where Google’s Pierre Far said:

I reached out to a team internally and they identified an algorithm that is inadvertently negatively impacting your site and causing the traffic drop. They’re working on a fix which hopefully will be deployed soon.

Google’s Kaspar Szymanski commented on Schwartz’s post, “While we can not guarantee crawling, indexing or ranking of sites, I believe this case shows once again that our Google Help Forum is a great communication channel for webmasters.”

Comments
  • http://www.SaltLakeCityOkay.com Adsense Publisher

    Wow.

    So a publisher actually got Google to admit that an algorithm was unintentionally hurting a website, and to announce that they would work on a fix. Google changing its algorithm for the sake of a single publisher seems like a manual correction to me.

  • AC

    Hi there, let’s begin with a few inspirational lines:
    PPT vs PPC
    PREPARE FOR THE SEARCH ENGINE REVOLUTION!
    EXPOSING SEARCH ENGINES’ CLICK SCAM.
    PAY FOR TIME IN POSITION not CLICKS!
    ONLY TIME IS MONEY, CLICK IS THE EVIL TRICK!
    PPT/PPP/PPT&P vs PPC
    PAY-PER-TIME / PAY-PER-POSITION / PAY-PER-TIME-IN-POSITION / PAY-PER-TIME-&-POSITION
    BIGGEST SCAM OF ALL TIME: GOOGLE SOLD US ON “CLICKS” AND “AD NON-SENSE”
    LET’S BAN PPC
    PPC IS THE ROOT OF ALL EVIL
    “ Take the course opposite to custom and you will almost always do well. ”
    — Jean Jacques Rousseau
    PPC is Google’s tax on small businesses under Wall Street protection.
    I have repeatedly explained that less government/Google PPC/“Google’s tax on small businesses” on the web means more money left in the private sector, where it is more likely to create jobs and generate wealth.

    I suggest a new and completely different solution, which can be used by all search engines. We need to stop search engines from stealing our hard-earned advertisers’ money.
    The solution is very simple, as all right solutions are:
    STOP PAYING FOR CLICKS AND COUNTING THEM!
    LET’S COUNT TIME SPENT AT A SPOT AND PAY ONLY FOR TIME.
    Every spot and time on a search results page can have its own value, owing to its particular location and prime time.
    This would allow an auction among advertisers for a particular spot or prime time for a particular duration, knowing how many clicks that spot brings. Simple as that!
    AND THIS WOULD ALMOST ELIMINATE CLICK FRAUD!
    The only remaining incentive for fraudulent clicks would be for search engines to inflate the value of a particular spot or prime time by forging the number of clicks it brings. However, that would not be much of a growing problem, and search engines could easily be punished for those kinds of bad click tactics.
    Right now they simply continue making their dirty money by letting us click on each other’s ads, creating automatic click bots and capitalizing on it BIG TIME, in billions. We could be leaving those billions in small-business pockets, and it’s up to us to vote for this in every blog possible. Let’s spread the news.

  • http://www.juvelagems.com/ jeweller

    Do you think this has something to do with the panda update?

    • http://www.webpronews.com/ Chris Crum

      I can’t say with 100% certainty, but I would say not.
