<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>WebProNews &#187; Bigdaddy</title>
	<atom:link href="http://www.webpronews.com/tag/bigdaddy/feed" rel="self" type="application/rss+xml" />
	<link>http://www.webpronews.com</link>
	<description>Breaking News in Tech, Search, Social, &#38; Business</description>
	<lastBuildDate>Mon, 13 Feb 2012 04:32:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Being a Bigdaddy Jagger Meister</title>
		<link>http://www.webpronews.com/being-a-bigdaddy-jagger-meister-2006-06</link>
		<comments>http://www.webpronews.com/being-a-bigdaddy-jagger-meister-2006-06#comments</comments>
		<pubDate>Fri, 09 Jun 2006 14:19:56 +0000</pubDate>
		<dc:creator>Jim Hedger</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[3]]></category>
		<category><![CDATA[Bigdaddy]]></category>
		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=29769</guid>
		<description><![CDATA[It took a little while to start to figure it out. Such things almost always do. After months of observation, research, discussion and debate, Search Engine Optimization experts appear to be getting a better handle on the effects of Google's Bigdaddy infrastructure upgrades.
]]></description>
			<content:encoded><![CDATA[<p>It took a little while to start to figure it out. Such things almost always do. After months of observation, research, discussion and debate, Search Engine Optimization experts appear to be getting a better handle on the effects of Google&#8217;s Bigdaddy infrastructure upgrades.</p>
<p>From mid-winter until this week, StepForth has strongly advised our clients to be conservative with any changes to their sites until enough time had passed for us, along with many others in the SEO community, to observe, analyze and articulate our impressions of the upgrade. About ten days ago, the light at the end of the intellectual tunnel became clearly visible, and SEO discussion forums are now abuzz with productive and proactive conversations about how to deal with a post-Bigdaddy Google environment. </p>
<p>To make the long back-story short, in September 2005, Google began implementation of a three-part algorithm update that became known as the Jagger Update. Shortly after completing the algo update in late November, Google began an upgrading of their server and data storage network that was dubbed the Bigdaddy Infrastructure Upgrade. The Bigdaddy upgrade took several months to completely roll out across all of Google&#8217;s data centers, which are rumoured to number in the hundreds. </p>
<p>In other words, the world&#8217;s most popular search engine has, in one way or another, been in a constant state of flux since September. The only solid information SEOs had to pass on to curious clients amounted to time tested truisms about good content, links and site structure. Being the responsible sort we are, no good SEO wanted to say anything definite for fear of being downright wrong and misdirecting others. </p>
<p>Starting in the middle of May and increasing towards the end of the month, ideas and theories that had been thrown around SEO related forums and discussion groups started to solidify into the functional knowledge that makes up the intellectual inventory of good SEO firms.<br />
<a name="czar"></a><br />
Guided by the timely information leaks from Google&#8217;s quality control czar <a href="http://www.mattcutts.com/blog/indexing-timeline/" class="bluelink">Matt Cutts</a>, discussion in the SEO community surrounding Bigdaddy related issues has been led by members of forums <a href="http://www.webmasterworld.com/forum30/34228-1-15.htm" class="bluelink">WebMasterWorld</a>, SEW (<a href="http://forums.searchenginewatch.com/showthread.php?t=11608&amp;page=1&amp;pp=20" class="bluelink">1</a>) (<a href="http://forums.searchenginewatch.com/showthread.php?t=11407" class="bluelink">2</a>) (<a href="http://forums.searchenginewatch.com/showthread.php?t=11686&amp;highlight=bigdaddy" class="bluelink">3</a>), <a href="http://www.threadwatch.org/search/node/bigdaddy" class="bluelink">Threadwatch</a>, SERoundtable (<a href="http://forums.seroundtable.com/showthread.php?t=747" class="bluelink">1</a>) (<a href="http://www.seroundtable.com/archives/003814.html" class="bluelink">2</a>) (<a href="http://www.seroundtable.com/archives/003909.html" class="bluelink">3</a>), and Cre8asite (<a href="http://www.cre8asiteforums.com/forums/index.php?showtopic=37184&amp;hl=bigdaddy" class="bluelink">1</a>) (<a href="http://www.cre8asiteforums.com/forums/index.php?showtopic=34407&amp;hl=bigdaddy" class="bluelink">2</a>).
SEO writers <a href="http://www.seobook.com/mt/mt-search.cgi?IncludeBlogs=1&amp;search=bigdaddy" class="bluelink">Aaron Wall</a>, <a href="http://www.textlinkbrokers.com/blogs/comments/388_0_1_0_C/" class="bluelink">Jarrod Hunt</a>, and <a href="http://www.site-reference.com/articles/Search-Engines/Understanding-Big-Daddy.html" class="bluelink">Mark Daoust</a> have also added their observations to the conversation in a number of separate articles.</p>
<p>By now, most good SEOs should be able to put their fingers on issues related to Bigdaddy fairly quickly and help work out a strategy for sites that were adversely affected by the upgrade. The first thing to note about the cumulative effects of Jagger and Bigdaddy is the intent of Google engineers to remove much of the poor quality or outright spammy commercial content that was clogging up their search results. </p>
<p>The intended targets of the Bigdaddy update go beyond sites that commit simple violations against Google&#8217;s webmaster guidelines to include affiliate sites, results gleaned from other search tools, duplicate content, poor quality sites and sites with obviously gamey link networks. In some cases, Google was targeting sites designed primarily to attract users to click on paid-search advertisements. </p>
<p>After the implementation of the algo update and infrastructure upgrades, SEOs have seen changes in the following areas: Site/Document Quality Scoring, Duplicate Content Filtering and Link Intention Analysis. </p>
<p>The first area noted is site or document quality scoring. Did you know there are now more web documents online than there are people on the planet? Many if not most of those documents are highly professional and some are sort of scrappy. While Google is not looking for perfection, it is trying to assess which pages are more useful than others and attention to quality design and content is one of the criteria. </p>
<p>Quality design simply means giving Google full access to all areas of the site webmasters want spidered. Smart site and directory structures tend to place spiderable information as high in the directory tree as possible. While Google is capable of spidering deeply into database sites, it appears to prefer to visit higher level directories much more frequently. We have noted that Google agent visits do tend to correspond with update times set via Google Sitemaps. </p>
<p>Accessibility and usability issues are thought to make up elements of the Jagger algorithm update, making the way visitors use a site or document, and the amount of time they spend engaged in a user-session associated with a site, important factors in ranking and placement outcomes. Internal and outbound links should be placed with care in order to make navigating through and away from a site as easy as possible for site visitors. </p>
<p>Another quality design issue involves letting Google know which &#8220;version&#8221; of your site is the correct one. Most websites can be accessed with or without typing the &#8220;www&#8221; part of the URL (e.g. <a href="http://www.stepforth.com/" class="bluelink">http://www.stepforth.com</a>, <a href="http://stepforth.com/" class="bluelink">http://stepforth.com</a>, <a href="http://www.stepforth.com/index.shtml" class="bluelink">http://www.stepforth.com/index.shtml</a>). This presents a rather funny problem for Google. Because links directed into a site might vary in the way they are written, sometimes it doesn&#8217;t know which &#8220;version&#8221; of a site is the correct one to keep in its cache. To Google, each of the variations of the URL above could be perceived as a unique website, an issue known as &#8220;canonicalization&#8221;, a subject <a href="http://www.mattcutts.com/blog/seo-advice-url-canonicalization/" class="bluelink">Matt Cutts addressed on his blog</a> in early January. </p>
<p>&#8220;Suppose you want your default URL to be http://www.example.com . You can make your webserver so that if someone requests http://example.com/ , it does a 301 (permanent) redirect to http://www.example.com/ . That helps Google know which url you prefer to be canonical. Adding a 301 redirect can be an especially good idea if your site changes often (e.g. dynamic content, a blog, etc.).&#8221;</p>
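<p>As a rough illustration of the redirect Cutts describes, the sketch below maps every variant of a URL onto one preferred form and reports when a 301 would be needed. The host name, the list of index file names and both function names are assumptions made for this example; on a live site the same rule would normally be written in the web server configuration rather than in application code.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical settings for illustration; they are not taken from the article.
PREFERRED_HOST = "www.example.com"
INDEX_NAMES = {"index.html", "index.shtml", "index.php"}


def canonical_url(url):
    """Return the single preferred form of a URL requested from this site."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Collapse default index documents onto their directory URL.
    head, _, tail = path.rpartition("/")
    if tail in INDEX_NAMES:
        path = head + "/"
    # Always answer on the preferred host and drop any fragment.
    return urlunsplit((parts.scheme, PREFERRED_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) when the request needs a redirect, else None."""
    target = canonical_url(url)
    return None if target == url else (301, target)
```

<p>With these assumed settings, a request for http://example.com/ or http://www.example.com/index.shtml would be answered with a 301 pointing at http://www.example.com/ , while a request that is already canonical returns None and is served normally.</p>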
<p>Quality content is a bit harder to manage and a lot harder to define. Content is a word used to describe the stuff in or on a web document and could include everything from text and images, Flash files, audio or video and links. </p>
<p>There are two basic rules in regards to content. It should be there to inform and assist the site user&#8217;s experience and it should be, (in as much as possible), original. </p>
<p>Making an easy to use site that provides visitors with the information they are looking for is the responsibility of webmasters but there are a few simple ways to show you are serious about its presentation. </p>
<p>Focus on your topic and stick to it. Many of the sites and documents that have found themselves demoted or de-listed during the Bigdaddy upgrade were sites that delved across several topics at the same time without presenting a clear theme. Given the option between documents with clear themes and documents without clear themes, Google&#8217;s choice is obvious. </p>
<p>Google is working to weed out duplicate content. Google appears to be looking for incidents of duplicate content in order to try to promote the originator of that content over the replications. This has hit sites in the vertical search sector, the affiliate marketing sector, real estate sites, and even retail sites that carry brand name products especially hard. Several shopping or product focused database sites have seen hundreds or even thousands of pages falling out of Google&#8217;s main index. </p>
<p>In many cases, there is little or nothing to do for this except to start writing unique content for products listed in the databases of such sites. Many real estate sites, for example, use the same local information sources as their competitors do and all tend to draw content from the same selection of MLS listings. It&#8217;s not that Google thinks this content is &#8220;useless&#8221;, it&#8217;s that Google already has several other examples of the same content and is not interested in displaying duplicate listings. </p>
<p>Many of the listings previously enjoyed by large database driven sites have fallen into a Google index known as the supplemental listings database. Supplemental listings are introduced to the general listings shown to Google users when there are no better examples to choose from to meet the user&#8217;s search query. This is the same index that is often referred to as the &#8220;Google Sandbox&#8221;. </p>
<p>The last major element noted in the discussions surrounding Bigdaddy is how much more robust Google&#8217;s link analysis has become. Aside from site quality and duplicate content issues, most webmasters will find answers to riddles posed by Bigdaddy in their network of links. </p>
<p>In order to ferret out the intent of webmasters, Google has increased the importance of links, both inbound and outbound. Before the updates, an overused tactic for strong placement at Google saw webmasters trying to bulk up on incoming links from where ever they could. This practice saw the rise of link farms, link exchanges and poorly planned reciprocal link networks. </p>
<p>One of the ways Google tries to judge the intent of webmasters is by mapping the network of incoming and outgoing links associated with a domain. Links, according to the gospel of Google, should exist only as navigation or information references that benefit the site visitor. Google examines the outbound links from a page or document and compares them against its list of inbound links, checking to see how many match up and how many are directed towards, and/or coming from pages featuring unrelated or irrelevant content. Links pointing to or from irrelevant content or reciprocal links between topically unrelated sites are easily spotted and their value to the overall site ranking downgraded or even eliminated. </p>
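<p>The comparison of inbound and outbound links described above can be pictured with a small sketch. It is only an illustration of the idea: Google has not published how it performs this analysis, and the function name, the topic labels and the example domains are all hypothetical.</p>

```python
def link_profile(site_topic, inbound, outbound):
    """Summarise a site's link network.

    inbound and outbound each map a linked domain to that domain's topic.
    Topic labels are assumed to be known in advance for this illustration.
    """
    # Domains that both link to the site and are linked from it.
    reciprocal = set(inbound) & set(outbound)
    # Reciprocal partners whose topic differs from the site's own topic.
    off_topic_reciprocal = {d for d in reciprocal if inbound[d] != site_topic}
    # Every linked domain, inbound or outbound, with its topic.
    all_links = {**inbound, **outbound}
    off_topic = {d for d, topic in all_links.items() if topic != site_topic}
    return {
        "reciprocal": sorted(reciprocal),
        "off_topic_reciprocal": sorted(off_topic_reciprocal),
        "off_topic_share": len(off_topic) / len(all_links) if all_links else 0.0,
    }


profile = link_profile(
    "landscaping",
    inbound={"nursery.example": "landscaping", "cards.example": "credit cards"},
    outbound={
        "nursery.example": "landscaping",
        "cards.example": "credit cards",
        "hort.example": "landscaping",
    },
)
```

<p>In this made-up case the exchange with the nursery is reciprocal but on topic, while the exchange with the credit card site is both reciprocal and unrelated, which is the pattern the article suggests is most likely to be discounted.</p>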
<p>The subject of links brings up an uncomfortable flaw in Google&#8217;s inbound link analysis that is being referred to as Google Bowling. As part of its scan of the network of links associated with a document or URL, Google keeps a detailed record of who links to whom, how long the link has been established, and whether there is a reciprocal link back, along with several other items. </p>
<p>One of those items appears to be an examination of how and why webmasters might purchase links from another site. While bought links are not technically a rankings killer, a bulk of such links purchased from irrelevant sites in a short period of time can effectively destroy a site or document&#8217;s current or potential rankings. In an article published at WebProNews last autumn, <a href="http://www.webpronews.com/expertarticles/expertarticles/wpn-62-20051027GoogleBowlingHowCompetitorsCanSabotageYouWhatGoogleShouldDoAboutIt.html" class="bluelink">Michael Perdone</a> from e-TrafficJams speculates on the issue. </p>
<p>Google has tried to deal with the predatory practice of &#8220;Google Bowling&#8221; by considering the behaviour of webmasters whose sites have seen a number of inbound links from &#8220;bad neighbourhoods&#8221; suddenly appear. If a site that has incoming links from bad places also has or creates links directed outbound to bad places, the incoming links are judged more harshly. If, on the other hand, a website has a sudden influx of bad-neighbourhood links but does not contain outbound links directed to bad places, the inbound ones might not be judged as harshly. </p>
<p>The combination of the Bigdaddy upgrade and the Jagger algorithm update has made Google a better search engine and is a precursor to the integration of video content and information pulled from other Google services such as Google Base, Google Maps and Google Groups into the general search results. </p>
<p>Before the completion of both, Google&#8217;s search results were increasingly displaying a number of poor quality results stemming from a legion of scraped content &#8220;splog&#8221; sites and phoney directories that had sprung up in efforts to exploit the AdSense payment system. Bigdaddy and Jagger are a combined effort to offer improved, more accurate rankings while, at the same time, expanding Google&#8217;s ability to draw and distribute content across its multiple arms and networks. </p>
<p>Moving forward, that is what users should expect Google to do. Google is no longer a set of static results updated on a timed schedule. It is constantly updating and rethinking its rankings, especially in light of the number of people trying to use those rankings for their own commercial gain. </p>
<p>The effects of the dual upgrades seem to be settling out. Credible, informative sites should have nothing to worry about in the post-Bigdaddy environment. As Google is trying to move into the most mainstream areas of modern marketing, credibility is its chief concern. The greatest threat to Google&#8217;s dominance does not come from other tech firms. It comes from the users themselves who, if displeased with results being shown by Google, could migrate en masse to another search engine.</p>
<p>Jim Hedger is the SEO Manager of <a href="http://www.Stepforth.com/">StepForth Search Engine Placement Inc.</a> Based in Victoria, BC, Canada, StepForth is the result of the consolidation of BraveArt Website Management, Promotion Experts, and Phoenix Creative Works, and has provided professional search engine placement and management services since 1997. http://www.stepforth.com/  Tel &#8211; 250-385-1190  Toll Free &#8211; 877-385-5526  Fax &#8211; 250-385-1198</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/being-a-bigdaddy-jagger-meister-2006-06/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bigdaddy Timeline, Courtesy of Matt Cutts</title>
		<link>http://www.webpronews.com/bigdaddy-timeline-courtesy-of-matt-cutts-2006-05</link>
		<comments>http://www.webpronews.com/bigdaddy-timeline-courtesy-of-matt-cutts-2006-05#comments</comments>
		<pubDate>Thu, 18 May 2006 17:17:07 +0000</pubDate>
		<dc:creator>Jim Hedger</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[Bigdaddy]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[index]]></category>
		<category><![CDATA[Indexing]]></category>
		<category><![CDATA[Quality]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=29359</guid>
		<description><![CDATA[Sometime around January or February, a number of webmasters began to notice that Google had somehow "lost" huge portions of their websites.
]]></description>
			<content:encoded><![CDATA[<p>Sometime around January or February, a number of webmasters began to notice that Google had somehow &#8220;lost&#8221; huge portions of their websites.</p>
<p>References to their sites, generally to the index pages and a seemingly random selection of internal pages, still existed in Google listings, but pages that once drove sizable amounts of traffic appeared to vanish into the ether. As February rolled into March, more reports were posted to blogs and forums by frustrated webmasters who had noticed that the number of pages from their sites in Google&#8217;s index had declined significantly.</p>
<p>Many SEO firms, including StepForth, received information requests and research projects from clients who wanted to know what had happened to their sites. In all cases, we did the best we could but, given the obvious complexity of the update and the lack of fresh information from Google, recommendations given during this period have more resembled shotgun style SEO advice than the finer laser focus most of us would normally prefer to offer our clients. As is the case with most major updates, investigation as often as not leads to more questions.</p>
<p>Matt Cutts, Google&#8217;s Search Quality Officer and #1 communicator, answered many of those questions yesterday in an open and wide ranging post titled &#8220;<a href="http://www.mattcutts.com/blog/indexing-timeline/" class="bluelink">Indexing Timeline</a>&#8221;.</p>
<p>The post outlines how Google staff have examined and responded to webmasters&#8217; queries and complaints stemming from the Bigdaddy update. It also addresses a number of issues facing webmasters who have seen sections of their sites disappear from the SERPs, including the quality of both inbound and outbound links, irrelevant reciprocal linking schemes, and duplicate text found on vertical reference and affiliate sites. </p>
<p>According to his timeline, on March 13, Googleguy asked webmasters to offer example sites for Google&#8217;s analysis in a post at <a href="http://www.webmasterworld.com/forum30/33595.htm" class="bluelink">WebmasterWorld</a> . Commenting on the sites offered up for examination Cutts wrote,</p>
<p>&#8220;After looking at the example sites, I could tell the issue in a few minutes. The sites that fit &#8220;no pages in Bigdaddy&#8221; criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling. The Bigdaddy update is independent of our supplemental results, so when Bigdaddy didn&#8217;t select pages from a site, that would expose more supplemental results for a site.&#8221;</p>
<p>That quote covers a lot of ground but it explains a great deal of Google&#8217;s post-Bigdaddy behaviour.</p>
<p>Google bases its ranking algorithm on trust. That might sound naive to the uninformed, but we are discussing one of the most informed electronic entities that has ever existed. Google also keeps <a href="http://news.stepforth.com/whitepaper/google-patent-may05/index.php" class="bluelink">historic records</a> on every item contained in its index. Though it bases its opinions on a baseline of trust, those opinions are extremely well informed.</p>
<p>In order to remain continually informed, it spiders everything it can and sorts the data later. Google maintains a massive number of indexes including one known as the <a href="http://www.google.com/support/bin/answer.py?answer=12286" class="bluelink">supplemental index</a> . The supplemental index is a much larger representation of documents found on the web than those included in the main Google index.</p>
<p>&#8220;We&#8217;re able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.&#8221; (source: <a href="http://www.google.com/support/bin/answer.py?answer=12286" class="bluelink">Google Help Center</a>)</p>
<p>Many of the results that appeared to have disappeared are assumed to have been drawn from the supplemental results before the update. &#8220;A supplemental result is just like a regular web result, except that it&#8217;s pulled from our supplemental index&#8221;.</p>
<p>As Cutts is quoted saying above, Bigdaddy results are separate from supplemental results. When a reference to a site is found in the main (Bigdaddy) results, Google does not necessarily dip into supplemental results as often as it might have previously.</p>
<p><b>Quality On, Quality In and Quality Out </b></p>
<p>Google has gotten better at judging the quality of content found on a document and within a site. Content includes text, images, titles, tags and both inbound and outbound links. Google has consistently said that well-built sites offering quality information and a positive user experience should perform well throughout its search indexes, and it provides a wealth of information via the Google Help Center and through its webmaster-focused spokespersons, Cutts and Googleguy.</p>
<p>As Google has gotten better at determining the origin and history of content found in its various indexes, it tries to snip away at duplicate forms of on-site content, with the goal of listing the most trustworthy sites under any given user query in the main index.</p>
<p>Having been inundated over the years with multiple replications of what was already considered duplicate content, Google (like other search engines) has gotten very good at knowing whether it has already indexed similar or duplicate content. Google is capable of examining text (including individual paragraphs), images and link networks (inbound and outbound links), looking for telltale signs of duplicate content.</p>
<p>If, for example, it perceives a site displaying product information pulled from the same product database that 25,000 other sites pull duplicate product information from, Google is not likely to rank that site well. Similarly, if it finds duplicate networks of reciprocal links shared among several pages in its index, it is not likely to assign a high trust value to that document.</p>
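<p>One common textbook way to spot near-duplicate text is to compare overlapping word sequences (&#8220;shingles&#8221;) between two documents. The sketch below shows that general technique only; nothing in Cutts&#8217; post says this is the method Google uses, and the function names, the shingle size and the sample product descriptions are assumptions made for illustration.</p>

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if not words:
        return set()
    # Texts shorter than one shingle yield a single, shorter sequence.
    count = max(len(words) - size + 1, 1)
    return {tuple(words[i:i + size]) for i in range(count)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two made-up product descriptions that differ only in the last word.
score = similarity(
    "blue cotton t shirt with logo",
    "blue cotton t shirt with pocket",
)
```

<p>Here three of the five distinct three-word sequences are shared, so the score is 0.6. A site whose pages score close to 1.0 against thousands of other pages drawn from the same product feed would be in the position the paragraph above describes.</p>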
<p><b>Reciprocal linking strategies </b></p>
<p>&#8220;As these indexing changes have rolled out, we&#8217;ve improving how we handle reciprocal link exchanges and link buying/selling.&#8221;</p>
<p>Though Cutts points at reciprocal linking as an indicator to Google that there might be issues with a website&#8217;s credibility, that doesn&#8217;t automatically mean that all reciprocal links are going to cause problems for webmasters. Common sense and the value of delivering a quality user experience should dictate decisions around link strategies. </p>
<p>For example, if a professional landscaper provided links to plant nurseries in his or her region, and those nurseries in turn provided links to that landscaper, Google would likely consider those to be quality links. There is a direct relevance between the two sources of information. A network of links between local landscaping businesses, nurseries, horticultural institutes, <a href="http://www.google.com/search?hl=en&#038;lr=&#038;safe=off&#038;c2coff=1&#038;q=define%3A+permaculture&#038;btnG=Search" class="bluelink">permaculture</a> initiatives, non-profit volunteer groups and a number of gardening centers, shared amongst a relevant set of websites would also likely be judged beneficial to Google users and not subject to supplemental penalization.</p>
<p>On the other hand, a network of obviously purchased links between anyone who will exchange links with each other, regardless of relevancy or direct user benefit is likely to trip any number of filters present in the Bigdaddy/Jagger upgrades.</p>
<p>Cutts provided an example of a simple error made by a real estate site. Along with a number of internal reference links to exotic properties displayed as a footer-style site-map, Cutts found several outbound links with anchor text reading:<br />
1-exersize-equiptment.com, Credit Cards, Quit Smoking Forum, Hair Care, and GoSearchFor.com. When he reset, he saw a similar set of links, only this time the outbound links were directed towards mortgage sites, credit card sites, and exercise equipment. Cutts commented, &#8220;&#8230;if you were getting crawled more before and you&#8217;re trading a bunch of reciprocal links, don&#8217;t be surprised if the new crawler has different crawl priorities and doesn&#8217;t crawl as much.&#8221; </p>
<p><b>Affiliate Text and Content </b></p>
<p>Cutts devoted a long paragraph covering affiliate text, mentioning a T-shirt site that once had about 100 pages indexed, a number recently reduced to only 5.</p>
<p>&#8220;The person said that every page has original content, but every link that I clicked was an affiliate link that went to the site that actually sold the T-shirts. And the snippet of text that I happened to grab was also taken from the site that actually sold the T-shirts. The site has a blog, which I&#8217;d normally recommend as a good way to get links, but every link on the blog is just an affiliate link. The first several posts didn&#8217;t even have any text, and when I found an entry that did, it was copied from somewhere else. So I don&#8217;t think that the drop in indexed pages for this domain necessarily points to an issue on Google&#8217;s side. The question I&#8217;d be asking is why anyone would choose your &#8220;favourites&#8221; site instead of going directly to the site that sells T-shirts?&#8221;</p>
<p><b>The Ghosts of minutes past </b></p>
<p>We live in the present. Our websites live in the past as well as the present. Google keeps tabs on all documents in its index and even if it has, &#8220;&#8230; spidered content that was posted only moments before,&#8221; it has an elephant&#8217;s memory for previous details and a computer&#8217;s ability to pull lots of information together to get a bigger picture of how all those details fit together.</p>
<p>Google works by following links. Google ranks by examining the quality of content found on a site and also on the sites that link into, or are linked to from, sites in its indexes. If you have seen a great deal of page content fall away from Google&#8217;s index, or if you are just generally interested in how Google is working, read Cutts&#8217; Bigdaddy &#8220;<a href="http://www.mattcutts.com/blog/indexing-timeline/" class="bluelink">Indexing Timeline</a>&#8221;. </p>
<p>Jim Hedger is the SEO Manager of <a href="http://www.Stepforth.com/">StepForth Search Engine Placement Inc.</a> Based in Victoria, BC, Canada, StepForth is the result of the consolidation of BraveArt Website Management, Promotion Experts, and Phoenix Creative Works, and has provided professional search engine placement and management services since 1997. http://www.stepforth.com/  Tel &#8211; 250-385-1190  Toll Free &#8211; 877-385-5526  Fax &#8211; 250-385-1198</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/bigdaddy-timeline-courtesy-of-matt-cutts-2006-05/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Matt Cutts Teaches Us To Crawl</title>
		<link>http://www.webpronews.com/matt-cutts-teaches-us-to-crawl-2006-04</link>
		<comments>http://www.webpronews.com/matt-cutts-teaches-us-to-crawl-2006-04#comments</comments>
		<pubDate>Mon, 24 Apr 2006 17:07:45 +0000</pubDate>
		<dc:creator>WebProNews Staff</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[Bigdaddy]]></category>
		<category><![CDATA[blog]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Matt Cutts]]></category>
		<category><![CDATA[Proxy]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=28737</guid>
		<description><![CDATA[The Google engineer followed up his WebmasterWorld PubCon Boston discussion of Google's Bigdaddy infrastructure update and "crawl cache" with a lengthier look at the topic.
]]></description>
			<content:encoded><![CDATA[<p>The Google engineer followed up his WebmasterWorld PubCon Boston discussion of Google&#8217;s Bigdaddy infrastructure update and &#8220;crawl cache&#8221; with a lengthier look at the topic.</p>
<table width="400" border="0" cellpadding="2" cellspacing="0">
<tr>
<td align="center"><img src="http://images.ientrymail.com/webpronews/042406CuttsCrawl.jpg" alt="Matt Cutts Teaches Us To Crawl" width="400" height="200" border="0" class="irImage" title="Matt Cutts Teaches Us To Crawl"></td>
</tr>
<tr>
<td align="right" class="caption" style="padding-bottom: 10px; padding-left: 45px; padding-right: 45px;">Matt Cutts Discusses Cache Crawling</td>
</tr>
<tr>
<td align="center" class="caption" style="padding-bottom: 0px;"><img src="http://images.ientrymail.com/webpronews/salon/complete.gif" width="334" height="21"></td>
</tr>
</table>
<p>Cutts&#8217; latest <a href="http://www.mattcutts.com/blog/crawl-caching-proxy/" class="bluelink">blog post</a> reviewed Bigdaddy&#8217;s crawl-caching proxy in greater depth. He even provided helpful charts to illustrate the process.</p>
<p>As a webmaster, one may see numerous fetches from multiple Googlebots, each of them using some bandwidth while accomplishing their appointed rounds. It makes for a more accurate Google index, but the site impact has given some webmasters fits over the bandwidth usage.</p>
<p>The proxy used in the Bigdaddy infrastructure works like other caching proxies: it retrieves pages from websites and fulfills requests from the various Google crawlers. Rather than several spiders each hitting a website, the crawlers can draw on the proxy&#8217;s cache.</p>
<p>Cutts breaks down the crawl caching in a summary in his post (spacing added; we like Matt, but we&#8217;d really like him to enjoy the Return key a bit more often <img src='http://www.webpronews.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />):</p>
<div style="margin-left:10px; margin-right:10px;">
<p><i>So the crawl caching proxy work like this: if service X fetches a page, and then later service Y would have fetched the exact same page, Google will sometimes use the page from the caching proxy.</i></p>
<p><i>Joining service X (AdSense, blogsearch, News crawl, any Google service that uses a bot) doesn&#8217;t queue up pages to be include in our main web index. Also, note that robots.txt rules still apply to each crawl service appropriately. If service X was allowed to fetch a page, but a robots.txt file prevents service Y from fetching the page, service Y wouldn&#8217;t get the page from the caching proxy.</i></p>
<p><i>Finally, note that the crawl caching proxy is not the same thing as the cached page that you see when clicking on the &#8220;Cached&#8221; link by web results. Those cached pages are only updated when a new page is added to our index.</i></p>
<p><i>It&#8217;s more accurate to think of the crawl caching proxy as a system that sits outside of webcrawl, and which can sometimes return pages without putting extra load on external sites.</i></p>
</div>
<p>
The essential goal of the proxy, to reduce bandwidth, seems to have worked to Google&#8217;s satisfaction. Cutts wrote that &#8220;it was working so smoothly that I didn&#8217;t know it was live.&#8221;</p>
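<p>For readers who think in code, the behavior Cutts describes can be sketched roughly as follows. This is only a toy model, not Google&#8217;s implementation: the class name, the fetch function, the service names, and the per-service set of disallowed URLs are all assumptions made for illustration, and the sketch always reuses a cached page, whereas Cutts says Google does so only sometimes.</p>

```python
from typing import Callable, Dict, Optional, Set


class CrawlCachingProxy:
    """Toy model of a shared fetch cache in front of several crawl services."""

    def __init__(self, fetch: Callable[[str], str], disallowed: Dict[str, Set[str]]):
        self._fetch = fetch            # hypothetical function that downloads a URL
        self._disallowed = disallowed  # service name -> URLs its robots.txt rules block
        self._cache: Dict[str, str] = {}
        self.origin_fetches = 0        # times the external site was actually hit

    def get(self, service: str, url: str) -> Optional[str]:
        # robots.txt rules apply per service, even if the page is already cached.
        if url in self._disallowed.get(service, set()):
            return None
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
            self.origin_fetches += 1
        return self._cache[url]


proxy = CrawlCachingProxy(
    fetch=lambda url: "contents of " + url,
    disallowed={"service_y": {"http://example.com/private"}},
)

proxy.get("service_x", "http://example.com/page")     # hits the site
proxy.get("service_y", "http://example.com/page")     # served from the cache
proxy.get("service_x", "http://example.com/private")  # hits the site
blocked = proxy.get("service_y", "http://example.com/private")  # refused

print(proxy.origin_fetches, blocked)
```

<p>In this sketch the site is hit twice for four requests, and the service that is disallowed from a page never receives it from the cache, which matches the robots.txt point in the quoted summary.</p>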
<p>David Utter is a staff writer for WebProNews covering technology and business. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/matt-cutts-teaches-us-to-crawl-2006-04/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Matt Cutts On Bigdaddy, RK, And Emmy</title>
		<link>http://www.webpronews.com/matt-cutts-on-bigdaddy-rk-and-emmy-2006-03</link>
		<comments>http://www.webpronews.com/matt-cutts-on-bigdaddy-rk-and-emmy-2006-03#comments</comments>
		<pubDate>Wed, 29 Mar 2006 17:49:18 +0000</pubDate>
		<dc:creator>WebProNews Staff</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[Bigdaddy]]></category>
		<category><![CDATA[blog]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Matt Cutts]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=28071</guid>
		<description><![CDATA[The Google engineer who webmasters have learned to love and fear tackled a slew of questions posted to his blog recently. There are only four or five live Matt Cutts sightings in a given year...
]]></description>
			<content:encoded><![CDATA[<p>The Google engineer who webmasters have learned to love and fear tackled a slew of questions posted to his blog recently. There are only four or five live Matt Cutts sightings in a given year&#8230;</p>
<table width="400" border="0" cellpadding="2" cellspacing="0">
<tr>
<td align="center"><img src="http://images.ientrymail.com/webpronews/032906MattCutts.jpg" alt="Matt Cutts On Bigdaddy, RK, And Emmy" width="400" height="200" border="0" class="irImage" title="Matt Cutts On Bigdaddy, RK, And Emmy"></td>
</tr>
<tr>
<td align="right" class="caption" style="padding-bottom: 10px; padding-left: 45px; padding-right: 45px;">  Matt Cutts Answers Reader Questions</td>
</tr>
<tr>
<td align="center" class="caption" style="padding-bottom: 0px;"><img src="http://images.ientrymail.com/webpronews/salon/complete.gif" width="334" height="21"></td>
</tr>
</table>
<p>&#8230;as Cutts noted when asked how likely he is to travel to the UK to visit or speak. Regrettably, a trip across the Atlantic isn&#8217;t in his plans.</p>
<p>That means the Clan of Cutts will have to attend events like Boston Pubcon and SES San Jose to catch a glimpse of him. However, he does make himself available on his well-regarded <a href="http://www.mattcutts.com/blog/q-a-thread-march-27-2006/" class="bluelink">blog</a>, most recently to answer a passel of questions from commenters.</p>
<p>He reiterated that link sellers should use the &#8220;nofollow&#8221; attribute to mark links they sell. &#8220;Not doing so can affect your reputation in Google,&#8221; he wrote. One can imagine the sonorous intonation of James Earl Jones sounding out that response.</p>
<p>The Bigdaddy software infrastructure update has been completed and fully deployed. This means webmasters should see different visits from Google. Said Cutts, &#8220;You will probably see less crawling by the older Googlebot, which has a User-Agent of &#8216;Googlebot/2.1 (+http://www.google.com/bot.html)&#8217;. I believe crawling from the Bigdaddy infrastructure has a new User-Agent, which is &#8216;Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)&#8217;.&#8221;</p>
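<p>Webmasters who want to see the shift in their own logs could tally visits by the two User-Agent strings Cutts quotes. The snippet below is a minimal sketch under stated assumptions: the function name and the sample list are hypothetical, the User-Agent values are assumed to have been pulled out of the access log already, and a User-Agent header can be spoofed, so a match does not prove the visitor is Google.</p>

```python
from collections import Counter

# The two strings quoted by Cutts.
OLD_GOOGLEBOT = "Googlebot/2.1 (+http://www.google.com/bot.html)"
NEW_GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def classify_user_agent(user_agent: str) -> str:
    """Label a User-Agent as the older Googlebot, the Bigdaddy one, or other."""
    ua = user_agent.strip()
    if ua == NEW_GOOGLEBOT:
        return "bigdaddy"
    if ua == OLD_GOOGLEBOT:
        return "older"
    return "other"


# Hypothetical User-Agent values extracted from an access log.
sample = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "SomeOtherBrowser/1.0",
]

counts = Counter(classify_user_agent(ua) for ua in sample)
print(counts["bigdaddy"], counts["older"], counts["other"])
```

<p>Exact string matching is used here because the article gives the full strings; a looser substring match would also count any other agent that merely mentions Googlebot.</p>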
<p>But forget about seeing the RK parameter ever again. Cutts noted that the parameter used to be visible in a Google toolbar query. &#8220;I wouldn&#8217;t expect to see the RK parameter have a non-zero value again,&#8221; he wrote.</p>
<p>One commenter questioned Cutts about the 64.233.185.104 datacenter, saying it seemed to work differently from the others and asking whether it mainly consisted of newly spidered data. &#8220;That wouldn&#8217;t surprise me,&#8221; said Cutts. &#8220;As Bigdaddy cools down, that frees us up to do new/other things.&#8221;</p>
<p>Although many follow Google&#8217;s most publicly known engineer and his writings, few may realize Cutts has an Emmy. Given that the Emmy in question is of the feline persuasion, it&#8217;s more likely Emmy has a Matt Cutts:</p>
<div style="margin-left:10px; margin-right:10px;"><i>Q: &#8220;Do you take Emmy with you to San Francisco?&#8221;<br />
A: Nope, Emmy is a true indoors cat; she doesn&#8217;t like to travel.</i></div>
<p>
The question and answer thread offers some interesting insights into Google and webmaster issues. It is definitely worth a few minutes of the reader&#8217;s time today.</p>
<p>David Utter is a staff writer for WebProNews covering technology and business. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/matt-cutts-on-bigdaddy-rk-and-emmy-2006-03/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Take a Few Cutts from Google&#8217;s Matt</title>
		<link>http://www.webpronews.com/take-a-few-cutts-from-googles-matt-2006-02</link>
		<comments>http://www.webpronews.com/take-a-few-cutts-from-googles-matt-2006-02#comments</comments>
		<pubDate>Thu, 02 Feb 2006 20:29:45 +0000</pubDate>
		<dc:creator>Lee Odden</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Bigdaddy]]></category>
		<category><![CDATA[blog]]></category>
		<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Online]]></category>
		<category><![CDATA[Update]]></category>

		<guid isPermaLink="false">http://www.webpronews.com/?p=26562</guid>
		<description><![CDATA[Matt Cutts was <a href="http://www.webmasterradio.fm/episodes/index.php?showId=16" class="bluelink">recently interviewed</a> on WebmasterRadio.fm's SEO Rockstars show with Oilman and Webguerrilla.
]]></description>
			<content:encoded><![CDATA[<p>Matt Cutts was <a href="http://www.webmasterradio.fm/episodes/index.php?showId=16" class="bluelink">recently interviewed</a> on WebmasterRadio.fm&#8217;s SEO Rockstars show with Oilman and Webguerrilla.</p>
<p>Rand has posted some <a href="http://www.seomoz.org/blogdetail.php?ID=775" class="bluelink">insightful takeaways</a>, including Matt&#8217;s thoughts on Sandbox, BigDaddy, linkbait and that v7ndotcom malarkey.</p>
<p>If you need more Matt Cutts, listen to this <a href="http://www.webmasterradio.fm/episodes/audio/2005/MW120205.mp3" class="bluelink">previous interview</a>, also at WebmasterRadio.fm. And of course, you would do well to visit his <a href="http://www.mattcutts.com/blog/" class="bluelink">blog</a>, where he recently provided an <a href="http://www.mattcutts.com/blog/bigdaddy-progress-update/" class="bluelink">update on Bigdaddy</a>.</p>
<p>Bigdaddy is the rollout of some significant infrastructure changes at Google. To see Bigdaddy search results compared to current results, <a href="http://64.233.179.104/" class="bluelink">try this</a>. Keep in mind, though, that Matt says, &#8220;I&#8217;d expect a new data center to be converted to Bigdaddy roughly every 10 days,&#8221; so a normal search on Google.com will likely return Bigdaddy-style results too.</p>
<p>Lee Odden is President and Founder of <a href="http://www.toprankresults.com/">TopRank Online Marketing</a>, specializing in organic SEO, blog marketing and online public relations. He&#8217;s been cited as a search marketing expert by publications including U.S. News &#038; World Report and The Economist and has implemented successful search marketing programs with top BtoB companies of all sizes. Odden shares his marketing expertise at <a href="http://www.toprankblog.com">Online Marketing Blog</a>, offering daily news, interviews and best practices.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webpronews.com/take-a-few-cutts-from-googles-matt-2006-02/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

