Facebook Explains Outage

Facebook Designing New Configuration System



Facebook was down for more than 2.5 hours for some users, according to the company. A post in Facebook’s engineering notes explains:

The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition. An automated system for verifying configuration values ended up causing much more damage than it fixed.

The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store. This works well for a transient problem with the cache, but it doesn’t work when the persistent store is invalid.

More of the technical details are in Facebook’s engineering note. Facebook has turned off the system that attempts to correct configuration values and is exploring new designs for it.
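To make the failure mode in the quoted explanation concrete, here is a minimal Python sketch. It is not Facebook's code: the function names, the `timeout` key, the "positive integer" validity rule, and the fallback value are all hypothetical, chosen only for illustration. The first function mirrors the behavior described in the quote (an invalid cached value is replaced from the persistent store). The second shows one possible guard, which is an assumption on my part and not necessarily the design Facebook is exploring.

```python
from typing import Dict


def is_valid(value: object) -> bool:
    # Hypothetical rule: a config value is valid when it is a positive integer.
    return isinstance(value, int) and value > 0


def naive_repair(cache: Dict[str, int], store: Dict[str, int], key: str) -> int:
    """Replace an invalid cached value with whatever the persistent store holds."""
    if not is_valid(cache.get(key)):
        # Trusts the store unconditionally, so a bad store value is copied
        # into the cache and the next check fails again.
        cache[key] = store[key]
    return cache[key]


def guarded_repair(
    cache: Dict[str, int], store: Dict[str, int], key: str, fallback: int
) -> int:
    """Only copy from the store when the store's value passes validation."""
    if is_valid(cache.get(key)):
        return cache[key]
    candidate = store.get(key)
    if is_valid(candidate):
        cache[key] = candidate
        return candidate
    # Store is invalid too: leave the cache untouched and use a known-good default.
    return fallback


# Transient cache problem: the store is correct, so the repair works.
cache, store = {"timeout": -1}, {"timeout": 30}
print(naive_repair(cache, store, "timeout"))  # 30

# Persistent store is invalid: the naive repair just re-copies the bad value.
cache, store = {"timeout": -1}, {"timeout": -1}
print(naive_repair(cache, store, "timeout"))  # -1
print(guarded_repair(cache, store, "timeout", fallback=30))  # 30
```

In the second scenario the naive version never converges, because every check finds an invalid value and goes back to the store, which is the situation the quote says the automated system could not handle.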

I attended a screening of the movie The Social Network last night, and Mark Zuckerberg’s character stressed how much downtime would hurt the site's reputation as he was getting it launched. I thought that was kind of funny, considering the timing.
