Controlling How Your Site Is Indexed

Over at the Google Blog, Dan Crow is publishing a series of how-tos on controlling what the Googlebot indexes and displays about your site through the Robots Exclusion Protocol. This is the cheap-and-easy version.

Don’t Index That Link!

Crow says that if you don’t want a specific page indexed, you can add a NOINDEX robots meta tag to that page. If, however, that means continually adding and removing NOINDEX, say for a news page that is constantly updated and redirected, you may want to go another way and save yourself some grief. If that news page is reached through a gateway page, add a NOFOLLOW tag to the gateway page instead, so the Googlebot stays put and doesn’t follow the links on it.

The code should look like this: <META NAME="ROBOTS" CONTENT="NOFOLLOW">
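
For comparison, the per-page NOINDEX alternative mentioned above might look like the sketch below. It assumes the tag sits in the page's head section, as robots meta tags normally do; the title is a made-up placeholder, not something from Crow's post.

```html
<HEAD>
<TITLE>Example news article</TITLE>
<!-- Ask compliant robots not to include this page in their index -->
<META NAME="ROBOTS" CONTENT="NOINDEX">
</HEAD>
```

Robots meta values can also be combined in one tag, separated by commas, for example CONTENT="NOINDEX, NOFOLLOW".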

Don’t Say That!

Sometimes webmasters don’t want cached versions of webpages showing up in the search results, especially on the chance that the information is dated, or has been updated. Sometimes, webmasters don’t want the usually-important "snippet" displayed in the results either.

To keep the Googlebot from creating a cached version on Google’s servers, Crow says to use the NOARCHIVE tag.
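
Following the same pattern as the NOFOLLOW example, a NOARCHIVE tag would presumably look like this (a minimal sketch, placed in the head of the page you don't want cached):

```html
<!-- Ask compliant robots not to show a cached copy of this page -->
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
```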


For snippets, the NOSNIPPET tag is useful, and it kills two birds with one stone: according to Crow, it also keeps the page from being cached.
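
Again as a sketch modeled on the NOFOLLOW example rather than copied from Crow's post, the NOSNIPPET tag could be written as:

```html
<!-- Ask compliant robots not to show a text snippet for this page in results -->
<META NAME="ROBOTS" CONTENT="NOSNIPPET">
```

Going by Crow's description, a separate NOARCHIVE value should not be needed on a page that already carries NOSNIPPET.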


For the long version, click here. For the first robots.txt tutorial, click here.
