<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>WebProNews &#187; Exclusion</title>
	<atom:link href="http://www.webpronews.com/tag/exclusion/feed" rel="self" type="application/rss+xml" />
	<link>http://www.webpronews.com</link>
	<description>Breaking News in Tech, Search, Social, &#38; Business</description>
	<lastBuildDate>Fri, 10 Feb 2012 15:09:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Google Flexes Robots Exclusion Protocol</title>
		<link>http://www.webpronews.com/google-flexes-robots-exclusion-protocol-2007-07</link>
		<comments>http://www.webpronews.com/google-flexes-robots-exclusion-protocol-2007-07#comments</comments>
		<pubDate>Fri, 27 Jul 2007 18:00:21 +0000</pubDate>
		<dc:creator>WebProNews Staff</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[blog]]></category>
		<category><![CDATA[Exclusion]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Protocol]]></category>
		<category><![CDATA[Robots]]></category>
		<category><![CDATA[Support]]></category>
		<category><![CDATA[Tag]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=39417</guid>
		<description><![CDATA[Two new features added to the protocol will help webmasters govern when an item should stop showing up in Google's web search, as well as providing some control over the indexing of other data types.
]]></description>
			<content:encoded><![CDATA[<p>Two new features added to the protocol will help webmasters govern when an item should stop showing up in Google&#8217;s web search, as well as providing some control over the indexing of other data types.<br />
<span id="more-39417"></span><br />
One of the features, <a href="http://www.webpronews.com/topnews/2007/07/13/unavailable-after-google-plans-new-meta-tag">support for the unavailable_after tag</a>, has been mentioned previously. Google&#8217;s Dan Crow made that initial disclosure.</p>
<p>
He has followed that up with a full-fledged post on the <a href="http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html">official Google blog</a> about the new tag. The unavailable_after META tag informs the Googlebot when a page should be removed from Google&#8217;s search results:</p>
<blockquote><p><i>This information is treated as a removal request: it will take about a day after the removal date passes for the page to disappear from the search results. We currently only support unavailable_after for Google web search results.</i></p>
<p>
<i>After the removal, the page stops showing in Google search results but it is not removed from our system.</i></p></blockquote>
<p>Fully removing something from Google still requires the URL removal tool, found as one of Google&#8217;s Webmaster Central tools.</p>
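<p>
As a rough illustration, the tag goes in the page&#8217;s head section alongside any other META tags. The date and time zone below are illustrative placeholders; check Google&#8217;s documentation for the accepted date format:</p>
<p><code>&lt;META NAME="GOOGLEBOT" CONTENT="unavailable_after: 25-Aug-2007 15:00:00 EST"&gt;</code></p>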
<p>
Google also gave webmasters some control over assets other than web pages. Those who publish PDF, audio, video, or other file types can tell the crawler how Google should handle those files in its index.</p>
<p>
&#8220;We&#8217;ve extended our support for META tags so they can now be associated with any file,&#8221; said Crow. &#8220;Simply add any supported META tag to a new X-Robots-Tag directive in the HTTP Header used to serve the file.&#8221;</p>
<p>
Supported META tags include options like noarchive, nosnippet, noindex, and unavailable_after. Google sees these as offering enough flexibility to satisfy site publishers; we imagine they have organizations like AFP and Copiepresse in mind here.</p>
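<p>
A minimal sketch of what such a response might look like for a PDF, assuming the publisher wants the file kept out of the index and not cached. The status line and content type are illustrative, and how the header gets added depends on the web server in use:</p>
<p><code>HTTP/1.1 200 OK<br />
Content-Type: application/pdf<br />
X-Robots-Tag: noindex, noarchive</code></p>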
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/google-flexes-robots-exclusion-protocol-2007-07/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Optimizing for Accidental Clicks</title>
		<link>http://www.webpronews.com/optimizing-for-accidental-clicks-2007-07</link>
		<comments>http://www.webpronews.com/optimizing-for-accidental-clicks-2007-07#comments</comments>
		<pubDate>Wed, 11 Jul 2007 15:01:25 +0000</pubDate>
		<dc:creator>Dan Sharp</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[Clicks]]></category>
		<category><![CDATA[comments]]></category>
		<category><![CDATA[Exclusion]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=39046</guid>
		<description><![CDATA[<div class="entry">I haven&#8217;t been to HotorNot.com for a long time, since the site first started and got a whole load of attention while I was at uni. I have read a couple of <a href="http://www.techcrunch.com/2007/07/09/get-a-little-bling-at-hotornot/" target="_blank" title="hot or not on techcrunch">articles of late</a> about how they have been forced to adapt to survive, so I thought I would take a look around the site.]]></description>
			<content:encoded><![CDATA[<div class="entry"><p>I haven&rsquo;t been to HotorNot.com for a long time, since the site first started and got a whole load of attention while I was at uni. I have read a couple of <a href="http://www.techcrunch.com/2007/07/09/get-a-little-bling-at-hotornot/" target="_blank" title="hot or not on techcrunch">articles of late</a> about how they have been forced to adapt to survive, so I thought I would take a look around the site.<span id="more-39046"></span></p>
<p>The first thing I noticed is how damn close the AdSense adverts are to the rating system. I accidentally clicked on one while voting quickly. I wonder how many others have done the same?</p>
<div style="text-align: center;"><img src="http://images1.ientrymail.com/webpronews/articlepictures/hotornot.jpg" alt="hot or not adverts?" title="hot or not adverts?" /></div>
<p>Which raises the question &#8211; where should Google draw the line between adverts that are honestly integrated into a website and those that go out of their way to optimise specifically for the accidental click&hellip;</p>
<p>How much are these ads pulling in? I would love to see the conversion rate on them. How many of those accidental clicks like mine are getting filtered and classed as invalid?</p>
<p>Would you trust your ads sitting there next to a voting system?!</p>
<p>From what I can see, the ad placement is not breaking Google&rsquo;s terms and conditions. Some advertisers just need to use <a href="http://adwords.google.com/support/bin/answer.py?hl=en&amp;answer=13248" target="_blank" title="site exclusion">site exclusion</a>.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/optimizing-for-accidental-clicks-2007-07/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google To Share Content Partners</title>
		<link>http://www.webpronews.com/google-to-share-content-partners-2007-02</link>
		<comments>http://www.webpronews.com/google-to-share-content-partners-2007-02#comments</comments>
		<pubDate>Wed, 28 Feb 2007 01:30:23 +0000</pubDate>
		<dc:creator>Dan Sharp</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[comments]]></category>
		<category><![CDATA[content]]></category>
		<category><![CDATA[Delicious]]></category>
		<category><![CDATA[Digg]]></category>
		<category><![CDATA[Exclusion]]></category>
		<category><![CDATA[Forums]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Partners]]></category>
		<category><![CDATA[Reddit]]></category>
		<category><![CDATA[SEW]]></category>
		<category><![CDATA[Share]]></category>
		<category><![CDATA[Tool]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=35665</guid>
		<description><![CDATA[Big news today is that Google are planning on openly listing content sites where adverts display for AdWords advertisers. I have <a title="Google content transparency" href="http://www.ppcblog.co.uk/ppc/google-domain-parking-search-network-exclusion/">criticised Google</a> about this in the past, so it's great news to hear that they will finally be introducing more transparency in their PPC network.]]></description>
			<content:encoded><![CDATA[<p>Big news today is that Google are planning on openly listing content sites where adverts display for AdWords advertisers. I have <a title="Google content transparency" href="http://www.ppcblog.co.uk/ppc/google-domain-parking-search-network-exclusion/">criticised Google</a> about this in the past, so it&rsquo;s great news to hear that they will finally be introducing more transparency in their PPC network.</p>
<p>In an article over at the <a title="New York Times" href="http://www.nytimes.com/2007/02/26/business/media/26adco.html?pagewanted=2&amp;_r=2&amp;adxnnl=0&amp;ref=technology&amp;adxnnlx=1172486876-Q6GehMR+5LAIiU21EYoDYQ" onclick="javascript:urchinTracker('/outbound/www.nytimes.com');">New York Times</a>, Kim Malone, director of online sales and operations for Google AdSense, spoke of the big change.</p>
<blockquote>
<p><em> In the next few months, Google&rsquo;s advertiser reports will begin listing the sites where each ad runs, Ms. Malone said. She added that advertisers on the Google networks would soon be able to bid on contextual ads on particular Web sites rather than simply buying keywords that appeared across Google&rsquo;s entire network.  Still, Ms. Malone said she did not see much of consequence coming from the changes. &ldquo;We don&rsquo;t expect a lot of demand for that placement targeting,&rdquo; she said. &ldquo;It&rsquo;s the brand, the display advertisers who care where they run.&rdquo;</em></p>
</blockquote>
<p>Well, I can see a LOT of consequences coming from these changes, with <a title="Google site exclusion tool" href="http://www.google.com/adwords/learningcenter/text/26070.html#26072" onclick="javascript:urchinTracker('/outbound/www.google.com');">Google&rsquo;s site exclusion tool</a> becoming more meaningful &amp; important to all advertisers&hellip;</p>
<p>More chatter over at the <a title="SEW Forums on Content Network URLS" href="http://forums.searchenginewatch.com/showthread.php?t=16385" onclick="javascript:urchinTracker('/outbound/forums.searchenginewatch.com');">SEW Forums</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/google-to-share-content-partners-2007-02/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google On Robots Exclusion Protocol</title>
		<link>http://www.webpronews.com/google-on-robots-exclusion-protocol-2007-02</link>
		<comments>http://www.webpronews.com/google-on-robots-exclusion-protocol-2007-02#comments</comments>
		<pubDate>Sat, 24 Feb 2007 03:59:36 +0000</pubDate>
		<dc:creator>Navneet Kaushal</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[Exclusion]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Protocol]]></category>
		<category><![CDATA[Robots]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=35553</guid>
		<description><![CDATA[A post on the official Google blog explains the Robots Exclusion Protocol. Some time ago we told you about an earlier post on the <a href="http://www.unofficialseoblog.com/2007/01/27/control-the-indexing-and-accessing-of-your-sites-by-search-engines/">robots.txt file</a>. <br />
]]></description>
			<content:encoded><![CDATA[<p>A post on the official Google blog explains the Robots Exclusion Protocol. Some time ago we told you about an earlier post on the <a href="http://www.unofficialseoblog.com/2007/01/27/control-the-indexing-and-accessing-of-your-sites-by-search-engines/">robots.txt file</a>. </p>
<p>That earlier post explained to web publishers how they can control the way search engines, Google included, access and index their sites. The main tool for this is the robots.txt file, which gives site owners considerable control over how their site is crawled.</p>
<p>The more recent post on the Robots Exclusion Protocol provides more details and examples of the mechanisms for controlling how Google accesses and indexes your website. </p>
<p>The post explains how to prevent Googlebot from following links. &ldquo;Usually when the Googlebot finds a page, it reads all the links on that page and then fetches those pages and indexes them. This is the basic process by which Googlebot &quot;crawls&quot; the web. This is useful as it allows Google to include all the pages on your site, as long as they are linked together.&rdquo; It goes on to say that you can add the NOFOLLOW tag to a page, which tells the Googlebot not to follow any links it finds on that page.</p>
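<p>As a sketch of what that looks like in practice, a page-level tag along these lines goes in the page&rsquo;s head section. Using GOOGLEBOT as the name targets Google only; the ROBOTS name is the usual way to address all compliant crawlers:</p>
<p><code>&lt;META NAME="GOOGLEBOT" CONTENT="NOFOLLOW"&gt;</code></p>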
<p>Further on, the post explains in detail how to control caching and snippets. &ldquo;Usually you want Google to display both the snippet and the cached link. However, there are some cases where you might want to disable one or both of these. For example, say you were a newspaper publisher, and you have a page whose content changes several times a day. It may take longer than a day for us to reindex a page, so users may have access to a cached copy of the page that is not the same as the one currently on your site. In this case, you probably don&#8217;t want the cached link appearing in our results.&rdquo;</p>
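<p>For illustration, the tags for those two cases would look roughly like the following; the first suppresses the cached link, the second the snippet. They can also be combined in one CONTENT value, separated by commas:</p>
<p><code>&lt;META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"&gt;<br />
&lt;META NAME="GOOGLEBOT" CONTENT="NOSNIPPET"&gt;</code></p>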
<p>To learn more about how the Robots Exclusion Protocol can help, read the <a href="http://googleblog.blogspot.com/2007/02/robots-exclusion-protocol.html" target="_blank" onclick="javascript:urchinTracker('/outbound/googleblog.blogspot.com');">complete post</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/google-on-robots-exclusion-protocol-2007-02/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Disabling Google and Other Search Engines From Crawling a Site</title>
		<link>http://www.webpronews.com/disabling-google-and-other-search-engines-from-crawling-a-site-2004-03</link>
		<comments>http://www.webpronews.com/disabling-google-and-other-search-engines-from-crawling-a-site-2004-03#comments</comments>
		<pubDate>Wed, 31 Mar 2004 19:45:38 +0000</pubDate>
		<dc:creator>Shari Thurow</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[Answers]]></category>
		<category><![CDATA[Exclusion]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Protocol]]></category>
		<category><![CDATA[Robots]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=9492</guid>
		<description><![CDATA[Reader question: I have a online database of horror movies, and I have a good Google rank. In my traffic logs I noted the last month a really growing of the bandwidth: one of the most important browsers of the server logs is Googlebot, so this traffic was generated for the spidering engine of Google. I have the 20 Gb bandwidth limit and I don't want to pay for excess, so I disable Google into my Web site. My question is:
]]></description>
			<content:encoded><![CDATA[<p>Reader question: I have a online database of horror movies, and I have a good Google rank. In my traffic logs I noted the last month a really growing of the bandwidth: one of the most important browsers of the server logs is Googlebot, so this traffic was generated for the spidering engine of Google. I have the 20 Gb bandwidth limit and I don&#8217;t want to pay for excess, so I disable Google into my Web site. My question is:</p>
<p>If I disable Google to my Web site, its possible Google.com erase or drop down my Web site for his directory?</p>
<p>Many thanks for your time and keep up the good work.</p>
<p>Answer:  Many thanks for posting this question because Web server issues and excluding robots are very important aspects of search engine marketing (SEM). The reader did not specifically state how he kept Googlebot from spidering his site. I am assuming that the reader used the Robots Exclusion Protocol.</p>
<p><b>Robots Exclusion Protocol</b></p>
<p>The Robots Exclusion Protocol is a means of instructing robots (or spiders) not to crawl a site. With the Robots Exclusion Protocol, Web site owners can instruct search engine spiders to not index individual Web pages, subdirectories, or even an entire site.  Instructions can also be tailored for individual search engines.</p>
<p>There are two types of robots exclusion: a meta tag and a text file.</p>
<p>To let Google know that you do not want a page indexed or its links followed, you can create the following meta tag:</p>
<p>	<code>&lt;META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW"&gt;</code></p>
<p>To let all search engine spiders know that you do not want a page indexed or its links followed, you can create the following meta tag:</p>
<p>	<code>&lt;META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"&gt;</code></p>
<p>For this tag to be effective on a whole site, you will have to place this tag on every page of your site.  This process can be quite boring and time consuming.  For that reason, I prefer to use the robots exclusion text file, commonly referred to as robots.txt, because it can easily be applied to an entire site.  </p>
<p>The robots.txt is a text file that you place on your server that instructs search engine spiders to NOT record the information in specified areas on your Web site, and not to follow the links on your Web site. In other words, the text file lets the search engine spiders know which sections of your site are off limits.</p>
<p>I usually create my robots.txt files in NotePad (PC) or SimpleText (Mac).  But you can create simple text files in HTML software such as Dreamweaver.  </p>
<p>Google will request the robots.txt file before trying to index any page within your site. For example, if you do not want Google to record any of the information on the site, type the following text into a text editor:</p>
<p><code>User-agent: Googlebot<br />
Disallow: / </code></p>
<p>Be sure to save the file as robots.txt. Do not use any other file extension.  If you save the file as a Word document and call it robots.doc, Google will ignore that file.</p>
<p><b>When search engines crawl too frequently</b></p>
<p>I understand the reader&#8217;s concern about bandwidth.  If Google or any search engine crawls a site too frequently, it takes up bandwidth. All of us pay for bandwidth.</p>
<p>However, when you instruct Google (or any search engine) to not crawl your site, you are essentially communicating, &#8220;Don&#8217;t show my Web pages in your search results.&#8221;  </p>
<p>I do not believe the reader&#8217;s intention was to exclude all of his Web pages from Google search engine results pages (SERPs). He just wants Google not to request pages from his server so often.</p>
<p>Google actually has a Web page with this information and an email address. This is a direct quote from Google&#8217;s Webmaster FAQs page:</p>
<p>&#8220;Please send an email to googlebot@google.com with the name of your site and a detailed description of the problem. Please also include a portion of the weblog that shows Google accesses, so we can track down the problem more quickly on our end.&#8221;</p>
<p>The URL for the information on this page is at <a href="http://www.google.com/webmasters/faq.html">http://www.google.com/webmasters/faq.html</a>.</p>
<p><b>When to use the Robots Exclusion Protocol</b></p>
<p>Some content is not important to site visitors and search engines, such as items in a CGI-BIN directory.  When your target audience searches for information, they are not interested in your site&#8217;s programs that generate your forms or your drop-down menus. They are not interested in a section of a Web site that is under construction.  They are not interested in redundant content, either. Using the Robots Exclusion Protocol ensures that unnecessary information is not shown in search results pages.</p>
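<p>As a minimal sketch, a robots.txt that keeps all compliant spiders out of a CGI-BIN directory and an unfinished section might look like this. The directory names are hypothetical; substitute the paths used on your own site:</p>
<p><code>User-agent: *<br />
Disallow: /cgi-bin/<br />
Disallow: /under-construction/</code></p>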
<p>For more details about the Robots Exclusion Protocol, please visit: <a href="http://www.robotstxt.org/wc/faq.html">http://www.robotstxt.org/wc/faq.html</a>.</p>
<p>Shari Thurow is Marketing Director at Grantastic Designs, Inc., a full-service search engine marketing, web and graphic design firm.  This article is excerpted from her book, Search Engine Visibility (http://www.searchenginesbook.com) published in January 2003 by New Riders Publishing Co.  Shari can be reached at shari@grantasticdesigns.com.</p>
<p><a href="http://www.webpronews.com/ebusiness/asktheexperts/wpn-38-20030411ShariThurowAnswersSEOQuestions.html">Shari Thurow Answers SEO Questions: Click Here For Free Answers</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/disabling-google-and-other-search-engines-from-crawling-a-site-2004-03/feed</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
	</channel>
</rss>

