GoogleBot Heralding Blog Search?

    April 28, 2004

GoogleBot has been requesting nonexistent files from sites’ root directories lately, leading some to suspect that Google is planning to add blog search to its results, possibly even as a new link above the search box.

GoogleBot gaining momentum in race with Skynet for dominance…

Matthew Mullenweg of the Photo Matt blog said, “watching my logs, I’ve been getting random requests from Googlebot for atom.xml and index.rdf files on this site and others. It’s always in the root or in relevant subdirectories (usually /blog or similar). All of these sites run WordPress, and I can promise there is no mention of or links to atom.xml or index.rdf anywhere. This means Googlebot is guessing that these files will be there.”
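For readers who want to check their own logs for the same pattern, here is a minimal Python sketch. It assumes the server writes the common "combined" access log format and that the crawler identifies itself with "Googlebot" in its user-agent string. The function name and the sample log lines are made up for illustration; only the two filenames come from Mullenweg's description.

```python
import re

# Feed filenames named in the quote above.
FEED_NAMES = ("atom.xml", "index.rdf")

# Pulls the request path, status code, and user agent out of a
# combined-format access log line (an assumed log format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def guessed_feed_requests(lines):
    """Return paths of Googlebot requests for feed files that got a 404."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a combined-format line; skip it
        path = match.group("path").split("?", 1)[0]  # drop any query string
        filename = path.rsplit("/", 1)[-1]
        if (
            "Googlebot" in match.group("agent")
            and filename in FEED_NAMES
            and match.group("status") == "404"
        ):
            hits.append(path)
    return hits


# Hypothetical sample lines, invented for this example.
SAMPLE = [
    '10.0.0.1 - - [28/Apr/2004:10:00:00 -0700] "GET /blog/atom.xml HTTP/1.0" '
    '404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '10.0.0.2 - - [28/Apr/2004:10:00:05 -0700] "GET /index.rdf HTTP/1.1" '
    '404 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '10.0.0.1 - - [28/Apr/2004:10:00:09 -0700] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

print(guessed_feed_requests(SAMPLE))
```

A 404 on a feed filename from a crawler only suggests guessing. It does not prove that no link to the file exists somewhere on the web, so treat the output as a starting point rather than evidence.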

Evan Williams, a former CEO and now a Google employee, echoed the point about GoogleBot’s requests for nonexistent files: “in some cases, these files don’t exist where they are requested from, and never have, leading some to speculate that this Google bot is not just crawling links, but looking specifically for feeds.”

Williams noted that “others have pointed out that these filenames are common for two ‘RSS-like’ feed formats, but not for other common types.”

Because Blogger, a Google company, recently moved to supporting Atom, an RSS alternative, GoogleBot’s apparent disregard for other common feed filenames may indicate that RSS-fed blogs will be left out of whatever Google is planning.

Williams gently dismisses this idea: “Is it more likely that this is not a calculated move, but that they are experimenting with crawling feeds in general and that, if they’re going to index them, they probably want as many as possible? And that maybe (hmmm…) they started with Blogger blogs first, since they were handy, and they tended to find feeds at index.rdf and atom.xml, and they haven’t yet optimized their crawler because they’ve been working on other stuff?”

Though Williams is a Google employee, it’s important to keep in mind that he’s only speculating. Or, as he says himself, “No one tells me anything.”

