Google Teaches Robots Tool About Sitemaps

    August 16, 2007
    WebProNews Staff

The robots.txt analysis tool at Google Webmaster Central has received a much-needed update and should now be more useful to webmasters.

Google has responded to webmaster feedback calling for an upgrade to the Webmaster Central robots.txt analysis tool by releasing a new version. The update lets the tool recognize sitemap declarations and relative URLs.

The Webmaster Central blog announced the update and its implications. Webmasters can see if their sitemap’s URL and scope test as valid, Google said.
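For readers unfamiliar with sitemap declarations, here is a minimal sketch of what one looks like in robots.txt and how a parser can pick it up. It uses Python's standard-library `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later), not Google's tool, and the robots.txt content and example.com URLs are placeholders invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; example.com is a placeholder domain.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
"""


def read_robots(text):
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = read_robots(ROBOTS_TXT)

# Declared sitemap URLs as a list, or None when no Sitemap line is present.
sitemaps = parser.site_maps()

# Check the Disallow rule against two sample URLs.
blocked = not parser.can_fetch("*", "http://www.example.com/private/page.html")
allowed = parser.can_fetch("*", "http://www.example.com/index.html")

print(sitemaps, blocked, allowed)
```

This only shows that the declaration is read. Whether the sitemap's URL and scope are valid is what Google's tool reports, and this sketch does not attempt that check.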

The tool's expanded reporting looks considerably more useful. It now reports multiple problems on a single line of robots.txt instead of stopping at the first one it finds. Google said it has also improved analysis and validation.

We’ve previously noted two new features added to the Robots Exclusion Protocol, and Google mentioned them again in their most recent Webmaster Central post.

The unavailable_after META tag lets webmasters control how long a page is presented in search results. If a webmaster tags a page as unavailable_after a given date, that page will no longer be returned in response to a relevant query once the date has passed.
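As a rough illustration, a page template could generate such a tag as sketched below. The article does not show the tag's syntax, so the `googlebot` name attribute and the date format are assumptions based on Google's announcement as best recalled, and `unavailable_after_meta` is a hypothetical helper name. Check Google's documentation for the exact form it accepts.

```python
from datetime import datetime, timezone


def unavailable_after_meta(expires):
    """Build a META tag asking that a page be dropped from results after `expires`.

    Assumption: the tag targets "googlebot" and takes a day-month-year
    timestamp with a time zone; neither detail comes from the article.
    """
    stamp = expires.astimezone(timezone.utc).strftime("%d-%b-%Y %H:%M:%S GMT")
    return f'<meta name="googlebot" content="unavailable_after: {stamp}">'


# Illustrative expiry date only.
tag = unavailable_after_meta(datetime(2007, 8, 25, 15, 0, 0, tzinfo=timezone.utc))
print(tag)
```

The tag would go in the page's head section alongside any other robots META tags.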

Google also noted the new X-Robots-Tag directive for non-HTML content. Items like PDFs and videos can be controlled with unavailable_after just as web pages can.
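Because a PDF or video has no HTML head to hold a META tag, the X-Robots-Tag is sent as an HTTP response header instead. The sketch below builds that header for a file response. The helper name `x_robots_headers` is hypothetical, and the HTTP-style date format is an assumption, so verify the accepted format against Google's documentation before relying on it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def x_robots_headers(expires):
    """Return HTTP header pairs marking a response as unavailable after `expires`.

    Assumption: the header value mirrors the META tag's content, using an
    HTTP-style date string.
    """
    stamp = format_datetime(expires.astimezone(timezone.utc), usegmt=True)
    return [("X-Robots-Tag", f"unavailable_after: {stamp}")]


# Illustrative expiry date only.
headers = x_robots_headers(datetime(2007, 8, 25, 15, 0, 0, tzinfo=timezone.utc))
for name, value in headers:
    print(f"{name}: {value}")
```

A web server or application framework would add these pairs to the response that serves the PDF or video file.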

Unavailable_after appears to be a response to companies that have complained about how Google indexes and presents content they want to control. It offers a compromise between not indexing content at all and leaving it available to searchers indefinitely.

News organizations like AFP and Copiepresse have sued Google over its indexing practices, but declined to edit their robots.txt files to keep the search engine out of their content. The page-level controls Google now recognizes through the unavailable_after and X-Robots-Tag directives should suit these groups, assuming they choose to use them.