Yahoo Gets Sectional With Robots.txt
Yahoo’s spiders will obey a new class attribute value, called robots-nocontent, that lets webmasters mark discrete sections of a page that they do not want indexed.
The microformat-inspired tag gives webmasters the ability to section off parts of a page unrelated to the main content. Items like navigation elements and headers that are not helpful to searchers won’t be used to find the page in search results.
Support for this starts this evening, according to Priyank Garg of the Yahoo Search team. "We are rolling out an index update tonight for this change," he said. (The promised post to the Yahoo Search blog has not appeared yet.) "As usual, you’ll see some changes in ranking along with shuffling of the pages that are included in the index."
Garg also provided examples of how the new attribute can be applied to tags on a web page. Using a Yahoo Answers page as an example, he showed how certain elements can be marked with class="robots-nocontent":
<div class="robots-nocontent">This is the navigational menu of the site and is common on all pages. It contains many terms and keywords not related to this site</div>
<span class="robots-nocontent">This is the site header that is present on all pages of the site and is not related to any particular page</span>
<p class="robots-nocontent">This is a boilerplate legal disclaimer required on each page of the site</p>
<div class="robots-nocontent">This is a section where ads are displayed on the page. Words that show up in ads may be entirely unrelated to the page contents</div>
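To make the effect concrete, here is a minimal sketch in Python of how a crawler might drop marked sections before indexing a page. It is an illustration only, not Yahoo's implementation: the names IndexableTextExtractor and indexable_text are hypothetical, and the sketch assumes well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IndexableTextExtractor(HTMLParser):
    """Hypothetical helper: collects page text, skipping any element whose
    class list contains "robots-nocontent" (and everything nested inside it)."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside an excluded element
        self.chunks = []     # text fragments that remain indexable

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # The class attribute is a space-separated list, so split before matching.
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or "robots-nocontent" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def indexable_text(html):
    """Return the text a crawler honoring robots-nocontent might index."""
    parser = IndexableTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    '<div class="robots-nocontent">Home | About | Contact</div>'
    '<p>How do I bake bread?</p>'
    '<p class="robots-nocontent">Boilerplate legal disclaimer</p>'
)
print(indexable_text(page))
```

In this sketch the navigation menu and the disclaimer are ignored and only the question text is kept. The full page is still fetched by the crawler, which is the point Garg makes about cloaking.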
Garg also addressed the question of cloaking as it pertains to the new attribute. "Using a 'nocontent' tag to mark explicit sections of content is not considered 'cloaking' because all of the content on the page is available [...] protect the relevance of the results (unlike 'cloaking' where we may be served content that is different from what users see)," he said.