Unvalidated Robots.Txt Risks Google Banishment
That could happen with an unvalidated robots.txt file. Robots.txt allows webmasters to provide standing instructions to visiting spiders, which contributes to having a site indexed faster and more accurately.
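To make the idea of standing instructions concrete, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, and the example.com URLs are all made up for illustration; real crawlers may interpret the file differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() accepts the file as an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is an assumed crawler name; it falls under the "*" section above.
allowed_public = parser.can_fetch("ExampleBot", "https://example.com/index.html")
allowed_private = parser.can_fetch("ExampleBot", "https://example.com/private/notes.html")

print(allowed_public, allowed_private)
```

A well-behaved spider performs a check like this before requesting each URL, which is why a stray or misread line in the file can have site-wide effects.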
Google has been considering new syntax to recognize within robots.txt. The Sebastians-Pamphlets blog said Google confirmed recognizing experimental syntax like Noindex in the robots.txt file.
This poses a danger to webmasters who have not validated their robots.txt. A line reading Noindex: / could lead to one's site being completely de-indexed.
The surname-less Sebastian recommended validating with Google's robots.txt analyzer, part of Google's Webmaster Tools, and using only the Disallow, Allow, and Sitemap crawler directives in the Googlebot section of robots.txt.
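As a rough illustration of that advice, the sketch below flags any robots.txt line whose directive is not on a short allow-list. The allow-list is an assumption based on the directives named above plus User-agent, the function name `find_unrecognized_directives` is hypothetical, and the sample file is invented. It is not a substitute for Google's own analyzer, only a quick local check for leftover or experimental lines.

```python
# Assumed allow-list: the directives recommended above, plus User-agent,
# which is needed to open a section in the first place.
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap"}


def find_unrecognized_directives(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line number, line text) for every line not on the allow-list."""
    findings = []
    for lineno, raw in enumerate(robots_txt.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Split on the first colon only, so URLs in Sitemap values stay intact.
        name, sep, _value = line.partition(":")
        if not sep or name.strip().lower() not in KNOWN_DIRECTIVES:
            findings.append((lineno, raw.strip()))
    return findings


# Invented sample file containing one experimental directive.
SAMPLE = """\
User-agent: Googlebot
Disallow: /tmp/
Noindex: /
Sitemap: https://example.com/sitemap.xml
"""

flagged = find_unrecognized_directives(SAMPLE)
print(flagged)
```

Here the `Noindex: /` line is the only one reported, which is the kind of forgotten statement a webmaster would want to review before a search engine starts acting on it.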
3 Comments
Experimental robots.txt syntax
Thanks for the coverage Dave. :)
Joshua, folks not familiar with the REP syntax often leave garbage or experimental statements in robots.txt. As long as the crawlers ignore those, such forgotten stuff is not a big deal. It becomes risky when a search engine experiments itself, and the engine's interpretation doesn't match the webmaster's thoughts.
robots.txt
When you put noindex, it is to be excluded, so this presents no threat. In fact, it accomplishes your goal. It's those spiders which ignore your wishes that you need to worry about.
Noindex:
Not to be a smart a**, but why would it be surprising if Google deindexed a site/page that includes Noindex: in the robots.txt??
Why would you have it in there in the first place if you didn't want to be excluded?
Maybe I missed something?