Matt Cutts On The Hardware & Software That Power Googlebot
Google has uploaded a new Webmaster Help video in which Matt Cutts answers a question about the hardware and server-side software that power a typical Googlebot server.
“So one of the secrets of Google is that rather than employing these mainframe machines, this heavy iron, big iron kind of stuff, if you were to go into a Google data center and look at an example rack, it would look a lot like a PC,” says Cutts. “So there’s commodity PC parts. It’s the sort of thing where you’d recognize a lot of the stuff from having opened up your own computer, and what’s interesting is rather than have like special Googlebot web crawling servers, we tend to say, OK, build a whole bunch of different servers that can be used interchangeably for things like Googlebot, or web serving, or indexing. And then we have this fleet, this armada of machines, and you can deploy it on different types of tasks and different types of processing.”
“So hardware wise, they’re not exactly the same, but they look a lot like regular commodity PCs,” he adds. “And there’s no difference between Googlebot servers versus regular servers at Google. You might have differences in RAM or hard disk, but in general, it’s the same sorts of stuff.”
On the software side, Google of course builds everything itself, so as not to rely on third parties. Cutts says there’s a running joke at Google along the lines of “we don’t just build the cars ourselves, and we don’t just build the tires ourselves. We actually vulcanize the rubber on the tires ourselves.”
“We tend to look at everything all the way down to the metal,” Cutts explains. “I mean, if you think about it, there’s data center efficiency. There’s power efficiency on the motherboards. And so if you can sort of keep an eye on everything all the way down, you can make your stuff a lot more efficient, a lot more powerful. You’re not wasting things because you use some outside vendor and it’s black box.”
“In the same way that you might examine your electricity bill and then tweak the thermostat, we constantly track our energy consumption and use that data to make improvements to our infrastructure. As a result, our data centers use 50 percent less energy than the typical data center,” wrote Joe Kava, Senior Director of data center construction and operations at Google.
Cutts says Google uses a lot of Linux-based machines and Linux-based servers.
“We’ve got a lot of Linux kernel hackers,” he says. “And we tend to have software that we’ve built pretty much from the ground up to do all the different specialized tasks. So even to the point of our web servers. We don’t use Apache. We don’t use IIS. We use something called GWS, which stands for the Google Web Server.”
“So by having our own binaries that we’ve built from our own stuff and building that stack all the way up, it really unlocks a lot of efficiency,” he adds. “It makes sure that there’s nothing that you can’t go in and tweak to get performance gains or to fix if you find bugs.”
If you’re interested in how Google really works, you should watch this video too: