Google Sitemaps Wants Your URLs

    April 5, 2006
    WebProNews Staff

Vanessa Fox, a technical writer for Google’s Sitemaps product, thinks you should be using Sitemaps with your site today. Here’s why.

“How do I get into Google?” It’s a common refrain on websites, forums, and blogs, and in the inboxes of tech writers. Google is the Saint Bernard in a pen full of adorable search advertising pugs, and everyone wants a sip of the Googlejuice in the cask the big dog carries.

To make it easier for site publishers to drink deeply of the traffic Google can deliver, the company launched Sitemaps in June 2005. Yesterday, Vanessa took us through the Sitemaps ecology to explain how it can benefit webmasters everywhere.

After a webmaster signs in to the service and places a file Google can verify on the website, Sitemaps begins collecting information about that site. Webmasters can select from several options under the Site Overview’s Stats tab for a sitemapped URL.

Query stats shown in Sitemaps list the top search queries and top search query clicks for a rolling three-week period. Those lists display the average top position for each of the 20 queries presented in each list.

Sitemaps also shows top search queries from mobile devices, and queries made via the mobile web. Any of the lists under query stats may be downloaded as a .csv file for easy import into Microsoft Excel, OpenOffice Calc, or other programs for analysis.

That .csv download is one of the many features Vanessa said the Sitemaps team adds or enhances roughly every month. They plan to continue doing so, but Vanessa declined to discuss forthcoming features.

Crawl stats add a level of transparency to the Google crawling process. They also show the PageRank distribution of all the pages the Googlebot has crawled within a site, and not just the home page.

A Status section of crawl stats displays a bar graph of successfully crawled pages, and those where Googlebot had problems: HTTP errors, URLs that timed out or were not followed, or URLs restricted by robots.txt.

Sitemaps lists all of these errors by status under an Errors tab in the Site Overview.

The Page analysis area of the stats shows how the Googlebot sees one’s website. It lists the types of pages and encodings Googlebot finds.

Of more interest will be the Common Words section of page analysis. This lists the anchor text Googlebot found most commonly in a site’s content, and in external links to a site.

An Index stats section provides six typical advanced searches site publishers perform for their URLs. The much-loved backlink search query that begins with link: can be found here.

The Sitemaps approach to testing robots.txt allows the webmaster to see whether an existing robots.txt file excludes what it is supposed to restrict, or whether it is restricting something the webmaster wants crawled.

The test always checks for Googlebot access issues automatically. Webmasters can also choose to task additional Googlebots to see if mobile, image, or AdSense content for media partners will be accessed correctly by those crawlers.
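The same kind of check can be approximated offline. The sketch below is not the Sitemaps tool itself; it uses Python's standard urllib.robotparser module, and the robots.txt rules, the example.com URLs, and the check_access helper are all hypothetical. Python's parser may also differ from Googlebot's own matching in edge cases, so treat the results as a rough guide.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot-Image
Disallow: /photos/
"""

def check_access(robots_txt, user_agents, urls):
    """Return {(agent, url): allowed} for every agent/URL pair."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        (agent, url): parser.can_fetch(agent, url)
        for agent in user_agents
        for url in urls
    }

PRIVATE = "http://www.example.com/private/report.html"
PHOTO = "http://www.example.com/photos/cat.jpg"
HOME = "http://www.example.com/index.html"

results = check_access(
    ROBOTS_TXT,
    ["Googlebot", "Googlebot-Image"],
    [PRIVATE, PHOTO, HOME],
)

for (agent, url), allowed in results.items():
    print(f"{agent:16} {'allowed' if allowed else 'blocked':8} {url}")
```

Running the check against each crawler name separately mirrors the idea in the text: a rule meant for one crawler can block, or fail to block, another.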

To best promote Sitemaps, Google made the protocol and the Sitemaps Generator available as open source projects. Several other versions of the Generator have been created as a result. As long as they follow the Sitemaps protocol, those maps will work for the website.

Under the Sitemaps approach, the webmaster places the XML file created with a Generator in a directory on the site where Googlebot can find it, which can speed up the process by which Google finds one’s web pages.

Google also permits the use of attributes with the Sitemap that tell the Googlebot how frequently to check particular pages for updates. Vanessa noted how this helps with sites that deliver a lot of dynamic content that normally may not be indexed correctly.
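For readers who want to see what such a file looks like, here is a minimal sketch that builds one with Python's standard library. In the published protocol the update hint is a child element named changefreq rather than an XML attribute. The namespace shown is the sitemaps.org 0.9 one, which is an assumption here, since the schema version Google accepted at the time of writing may differ. The build_sitemap helper and the example.com pages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace; check the schema your
# search engine currently documents before relying on it.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build a Sitemap XML string from a list of page dicts.

    Each dict needs a "loc" key and may carry a "changefreq" hint
    such as "hourly", "daily", or "monthly".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        if "changefreq" in page:
            ET.SubElement(url, "changefreq").text = page["changefreq"]
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical pages: a fast-changing listing and a static page.
sitemap_xml = build_sitemap([
    {"loc": "http://www.example.com/news/", "changefreq": "hourly"},
    {"loc": "http://www.example.com/about.html", "changefreq": "monthly"},
])
print(sitemap_xml)
```

The changefreq value is a hint to the crawler, not a command, so a dynamic page marked "hourly" may still be visited on a different schedule.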

Webmasters do not have to create an XML file for Sitemaps; they may submit a list of URLs as a text file or an RSS feed for indexing. That’s one of the enhancements the Sitemaps team has delivered since the product’s release.
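The text-file route is simple enough to script. The sketch below assumes the plain format of one absolute URL per line; the build_url_list helper, the filtering rules, and the example.com addresses are hypothetical choices for illustration, not requirements stated by Google.

```python
from urllib.parse import urlparse

def build_url_list(urls):
    """Return text with one absolute http(s) URL per line.

    Drops duplicates (keeping first occurrence) and anything that is
    not an absolute http or https URL.
    """
    seen = set()
    lines = []
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip relative paths and other schemes
        if url not in seen:
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"

# Hypothetical input with a duplicate and a relative path.
url_list = build_url_list([
    "http://www.example.com/",
    "http://www.example.com/products.html",
    "/contact.html",
    "http://www.example.com/",
])
print(url_list, end="")
```

The resulting text could then be saved as a UTF-8 file on the site and submitted in place of an XML Sitemap.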

Vanessa closed by citing the various support options available for Sitemaps. In addition to the product blog where she posts about Sitemaps issues, the service has a Google Group dedicated to it.

Several third-party solutions, like programs and websites, exist to support Google Sitemaps. They offer code snippets in a variety of programming languages and plugins that work with CMS and other application platforms.

It is important for a website today to be indexed promptly and accurately by Google. Precious few shortcuts exist that offer a legitimate approach to doing this. Google Sitemaps does offer one that any webmaster can use right now.



David Utter is a staff writer for WebProNews covering technology and business.