Search Bots Behaving Badly


In a classic bit of Internet history dated December 30, 1998, a newsgroup member explained the GoogleBot to a poster who had complained about its rude behavior.

Timeout: Putting bots in a corner for bad behavior...

“It’s a legitimate research project. Unfortunately, it’s the worst written piece of crapware you’ll ever see crawling your website. It ignores robots.txt (well…. it repeatedly retrieves it, but ignores the contents), ignores robots metatags and headers, gets confused by infinite trees and will suck all your excess bandwidth for significant periods if you let it.

“It’s broken. Deny it access.”
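
For contrast, here is a minimal sketch of what honoring robots.txt looks like from the crawler's side, using Python's standard urllib.robotparser module. The robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are all made up for illustration; this is not how any real search bot is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /calendar/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the parsed rules allow this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A polite crawler checks every URL before requesting it.
for url in ("http://example.com/index.html",
            "http://example.com/cgi-bin/search"):
    verdict = "fetch" if may_fetch(parser, "ExampleBot", url) else "skip"
    print(verdict, url)
```

A crawler that retrieves the file but never consults it before each request, as the newsgroup poster describes, skips the one step that matters here.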


(tip from SEORoundTable)

Dan Thies, posting at the SEORoundTable, mentioned a forum thread that told of Microsoft’s bot acting in a similarly rude manner.

The forum post, by lundens, states, “the main issues is 175,000 hits and 3 Gig of data just to support msnbot in May thus far is adding up to more than I can afford to use.”

“If I were getting hammered like that,” said Dan, “I wouldn’t rely on robots.txt, I’d block every known MSNbot IP address.”
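
The kind of IP-based blocking Dan describes could be sketched roughly as follows with Python's standard ipaddress module. The address ranges below are reserved documentation ranges standing in for a real block list, not actual MSNbot addresses, and is_blocked is a hypothetical helper; in practice this check usually lives in the web server or firewall configuration rather than in application code.

```python
import ipaddress

# Placeholder ranges from reserved documentation address space.
# These are NOT real crawler addresses; substitute the ranges seen in your logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the client address falls inside a blocked range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: this sketch chooses not to block it.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)


# Example: decide how to answer a few incoming requests.
for client in ("192.0.2.17", "203.0.113.5"):
    status = 403 if is_blocked(client) else 200
    print(client, status)
```

Unlike robots.txt, a check like this does not depend on the bot cooperating, which is the point of Dan's advice.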

The MSNbot apparently got caught in an infinite loop in a dynamically generated section of the poster’s website.

Have you noticed MSNbot exhibiting adolescent crawler behavior?

Garrett French is the editor of iEntry’s eBusiness channel. You can talk to him directly at WebProWorld, the eBusiness Community Forum.
