Cutts On 404s Vs. 410s: Webmasters Often Shoot Themselves In The Foot
Google’s latest Webmaster Help video, unlike the one before it, is very webmaster oriented. In it, Matt Cutts discusses how Google handles 404s versus how it handles 410s.
“Whenever a browser or Googlebot asks for a page, the web server sends back a status code,” he says. “200 might mean everything went totally fine. 404 means the page was not found. 410 typically means ‘gone,’ as in the page is not found, and we do not expect it to come back. So 410 has a little more of a connotation that this page is permanently gone. So the short answer is that we do sometimes treat 404s and 410s a little bit differently, but for the most part, you shouldn’t worry about it. If a page is gone, and you think it’s temporary, go ahead and use a 404. If a page is gone, and you know no other page that should substitute for it…you don’t have anywhere else that you should point to, and you know that that page is gone and never coming back, then go ahead and serve a 410.”
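For webmasters who want to see that rule of thumb in code, here is a minimal Python sketch of the decision Cutts describes. The page sets and the `status_for` function are hypothetical names made up for illustration; a real site would wire the same choice into its own server or framework.

```python
from http import HTTPStatus

# Hypothetical site data, for illustration only.
LIVE_PAGES = {"/", "/about"}
RETIRED_PAGES = {"/old-promo"}  # removed on purpose, never coming back


def status_for(path: str) -> HTTPStatus:
    """Pick a status code for a requested path."""
    if path in LIVE_PAGES:
        return HTTPStatus.OK          # 200: everything went fine
    if path in RETIRED_PAGES:
        return HTTPStatus.GONE        # 410: gone, not expected to return
    return HTTPStatus.NOT_FOUND       # 404: not found, possibly temporary


if __name__ == "__main__":
    for path in ("/about", "/old-promo", "/typo"):
        print(path, int(status_for(path)))
```

The design choice is simply that a 410 is only sent for pages the site owner has explicitly listed as retired; everything else that is missing falls back to a 404.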
“It turns out, webmasters shoot themselves in the foot pretty often,” he continues. “Pages go missing, people misconfigure sites, sites go down, people block Googlebot by accident, people block regular users by accident…so if you look at the entire web, the crawl team has to design to be robust against that. So 404, along with, I think, 401s and maybe 403s, if we see a page, and we get a 404, we are gonna protect that page for 24 hours in the crawling system. So we sort of wait, and we say, ‘Well, maybe that was a transient 404. Maybe it wasn’t really intended to be a page not found.’ And so in the crawling system, it will be protected for 24 hours. If we see a 410, then the crawling system says, ‘OK, we assume the webmaster knows what they’re doing because they went off the beaten path to deliberately say that this page is gone.’ So they immediately convert that 410 to an error, rather than protecting it for 24 hours.”
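The crawl-side behavior Cutts describes can be modeled roughly as follows. This is a toy sketch, not Google's actual implementation: the `classify` function, the state names, and the assumption that a 404 is treated as an error once the 24 hours have passed are all illustrative guesses. Cutts himself was unsure whether 401s and 403s get the same protection, so their inclusion here is tentative too.

```python
from datetime import datetime, timedelta

# Codes assumed to get the grace period. Cutts was only sure about 404;
# 401 and 403 are included because he said "I think" and "maybe".
PROTECTED_CODES = {401, 403, 404}
GRACE_PERIOD = timedelta(hours=24)


def classify(status: int, first_failure: datetime, now: datetime) -> str:
    """Return a hypothetical crawl state for a fetched page."""
    if status == 200:
        return "ok"
    if status == 410:
        # The webmaster deliberately said "gone": no grace period.
        return "error"
    if status in PROTECTED_CODES:
        # Possibly a transient mistake: protect the page for 24 hours.
        if now - first_failure < GRACE_PERIOD:
            return "protected"
        return "error"  # assumption: treated as an error after the window
    return "unknown"


if __name__ == "__main__":
    start = datetime(2014, 4, 1, 12, 0)
    print(classify(404, start, start + timedelta(hours=1)))
    print(classify(404, start, start + timedelta(hours=25)))
    print(classify(410, start, start + timedelta(hours=1)))
```

In this model the only difference between the two codes is timing: a 410 skips the waiting period that a 404 gets.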
“Don’t take this too much the wrong way,” Cutts adds. “We’ll still go back and recheck, and make sure, are those pages really gone or maybe the pages have come back alive again, and I wouldn’t rely on the assumption that that behavior will always be exactly the same. In general, sometimes webmasters get a little too caught up in tiny little details, and so if a page is gone, it’s fine to serve a 404. If you know it’s gone for real, it’s fine to serve a 410, but we’ll design our crawling system to try to be robust, but if your site goes down, or if you get hacked or whatever, that we try to make sure that we can still find the good content whenever it’s available.”
He also notes that these details can change. Long story short, don’t worry about it that much.