
Why Your Robots.txt Blocked URLs May Show up in Google

Matt Cutts Talks Uncrawled URLs in Search Results


Matt Cutts has appeared in yet another Google Webmaster Video, and this time he has a whiteboard with him to illustrate his point. The topic this time is uncrawled URLs that show up in search results.

Cutts says Google gets a lot of complaints from webmasters who say the search engine is violating their robots.txt files, with which they intend to keep Google from crawling certain pages. Sometimes those URLs still end up in search results.

According to Matt, in most cases where someone says "I blocked example.com/go in robots.txt," the result Google returns for that URL is just the bare URL, with no text snippet. The reason is that Google didn’t actually crawl the page.
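To make the mechanics concrete, here is a minimal sketch of how a well-behaved crawler consults robots.txt before fetching a URL. It uses Python's standard `urllib.robotparser` module. The robots.txt contents and the `/about` path are made up for illustration, and this is not a description of Googlebot's actual code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /go
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks this question before every fetch.
go_allowed = parser.can_fetch("Googlebot", "http://example.com/go")
about_allowed = parser.can_fetch("Googlebot", "http://example.com/about")

print("may fetch /go:", go_allowed)
print("may fetch /about:", about_allowed)
```

The rule only tells the crawler not to fetch the page. It says nothing about whether the URL itself may be listed when other pages link to it.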

"It did abide by robots.txt. You told us this page is blocked, so we did not fetch this page," says Matt. It is a URL reference. "We saw a link to it, but we didn’t fetch the page itself," he explains.

Because Google never fetched the page, it has no text to show as a snippet. As for why it shows such URLs at all, Cutts offers the example of the California DMV, whose site is www.dmv.ca.gov.

Cutts notes that at one point the California Department of Motor Vehicles had a robots.txt that blocked all search engines. "Now these days pretty much every site is savvy enough, you know, at one point the New York Times and eBay and a whole bunch of different sites would use robots.txt," he says.

If someone searches for "California DMV" in Google, there’s pretty much only one right answer, he says, and that is the answer Google wants to return. Luckily for Google, a lot of people were linking to that page with the anchor text "California DMV". That lets Google return the result without having to crawl the page.

Cutts also says that Google can get descriptions from a directory like the Open Directory Project (DMOZ). He cites Nissan and Metallica.com as examples of sites that used to block Google with robots.txt. Both were listed in the Open Directory Project, however, so Google took the description from there and used it as the snippet.

When this type of thing happens, it looks like the page was crawled, when in fact it wasn’t. "So we are able to return something that can be very helpful to users without violating robots.txt by not crawling that page," says Cutts.

He also notes that when you don’t want pages to show up, you can use the "noindex" meta tag at the top of the page. When Google sees this tag, it drops the page from its search results completely. Google has to be able to fetch the page to see the tag, so the page must not also be blocked in robots.txt. Another option is the URL removal tool.
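For reference, the tag in question is a robots meta tag in the page's head. The sketch below shows one and checks for it with Python's standard `html.parser`. The sample page and the `has_noindex` helper are made up for illustration, and this is not how Google itself parses pages.

```python
from html.parser import HTMLParser

# Hypothetical page carrying the noindex directive in its <head>.
PAGE = """<html><head>
<meta name="robots" content="noindex">
<title>Private page</title>
</head><body>Not meant for search results.</body></html>"""


class RobotsMetaParser(HTMLParser):
    """Records whether a robots (or googlebot) meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a noindex robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


print("noindex present:", has_noindex(PAGE))
```

A crawler can only see this tag if it is allowed to fetch the page, which is why the tag and a robots.txt block on the same URL work against each other.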

  • http://ajemailmarketing-software.blogspot.com Email Marketing Tools

    Useful information for SEO beginners…It helps to improve website traffic and search engine ranking. Thanks for sharing. Keep posting!

  • http://www.lexolutionit.com Maneet Puri

    There were quite a few instances when I saw pages that I had blocked in robots.txt show up in search results. Thanks for the informative post :)
