Wednesday, November 21, 2007

Unvalidated Robots.Txt Risks Google Banishment

The web crawling Googlebot may find a forgotten line in robots.txt that causes it to de-index a site from the search engine.
Webmasters welcome being dropped from Google about as much as they enjoy flossing with barbed wire. Making it easier for Google to do that is the last thing a webmaster wants. Why willingly exclude one's site from Google?

That could happen with an unvalidated robots.txt file. Robots.txt allows webmasters to provide standing instructions to visiting spiders, which contributes to having a site indexed faster and more accurately.

Google has been considering new syntax to recognize within robots.txt. The Sebastians-Pamphlets blog said Google confirmed recognizing experimental syntax like Noindex in the robots.txt file.

This poses a danger to webmasters who have not validated their robots.txt. A line reading Noindex: / could lead to one's site being completely de-indexed.

The surname-less Sebastian recommended validating the file with Google's robots.txt analyzer, part of Google's Webmaster Tools, and using only the Disallow, Allow, and Sitemap crawler directives in the Googlebot section of robots.txt.
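For webmasters who want a quick local check before turning to Google's analyzer, the advice above can be sketched in a few lines of Python. This is only an illustration, not Google's validator: the function name find_unrecognized_directives, the RECOGNIZED allow-list, and the example.com sample file are all assumptions made for this sketch. It flags any line whose directive falls outside the short list Sebastian recommends, plus User-agent.

```python
# Hypothetical allow-list: the directives recommended above, plus User-agent,
# which every robots.txt section needs. Adjust it to your own policy.
RECOGNIZED = {"user-agent", "disallow", "allow", "sitemap"}


def find_unrecognized_directives(robots_txt):
    """Return (line_number, directive, value) for each line outside RECOGNIZED."""
    findings = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if ":" not in line:
            # Not a "directive: value" pair at all, so report the whole line.
            findings.append((number, line, ""))
            continue
        directive, value = line.split(":", 1)
        if directive.strip().lower() not in RECOGNIZED:
            findings.append((number, directive.strip(), value.strip()))
    return findings


# Made-up sample file containing the forgotten line the article warns about.
sample = """\
User-agent: Googlebot
Disallow: /private/
Noindex: /
Sitemap: http://www.example.com/sitemap.xml
"""

for number, directive, value in find_unrecognized_directives(sample):
    print(f"line {number}: unrecognized directive {directive!r} with value {value!r}")
```

Run against the sample, the sketch reports only the Noindex line. It says nothing about how any search engine will actually interpret the file, so it complements rather than replaces the analyzer in Webmaster Tools.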


Noindex:

Not to be a smart a**, but why would it be surprising if Google deindexed a site/page that includes Noindex: in the robots.txt?

Why would you have it in there in the first place if you didn't want to be excluded?

Maybe I missed something?

Experimental robots.txt syntax

Thanks for the coverage Dave. :)

Joshua, folks not familiar with the REP syntax often leave garbage or experimental statements in robots.txt. As long as the crawlers ignore those, such forgotten stuff is not a big deal. It becomes risky when a search engine experiments itself, and the engine's interpretation doesn't match the webmaster's thoughts.
