Today WordPress.com was down for approximately 110 minutes, our worst downtime in four years. The outage affected 10.2 million blogs, including our VIPs, and appears to have deprived those blogs of about 5.5 million pageviews.
What Happened: We are still gathering details, but it appears an unscheduled change to a core router by one of our datacenter providers messed up our network in a way we haven’t experienced before, and broke the site. It also broke all the mechanisms for failover between our locations in San Antonio and Chicago. All of your data was safe and secure; we just couldn’t serve it.
What we’re doing: We need to dig deeper and find out exactly what happened and why. We also need to work out how to recover more gracefully next time, and how to isolate problems like this so they don’t affect our other locations.
I will update this post as we find out more, and have a more concrete plan for the future.
I know this sucked for you guys as much as it did for us — the entire team was on pins and needles trying to get your blogs back as soon as possible. I hope it will be much longer than four years before we face a problem like this again.
Update 1: We’ve gathered more details about what happened. There was a latent misconfiguration from a few months ago, specifically a cable plugged in someplace it shouldn’t have been. Something called the spanning tree protocol kicked in and started trying to route all of our private network traffic to a public network over a link that was much too small and slow to handle even 10% of our traffic, which caused high packet loss. This “sort of working” state was much worse than if the network had just gone down, and it confused both our systems team and our failsafe systems. It is not clear yet why the misconfiguration bit us yesterday and not earlier. Even though the network issue was unfortunate, we responded too slowly in pinpointing the issue and taking steps to resolve it using alternate routes, extending the downtime to 3-4x longer than it should have been.
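To make the “sort of working” point concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustration under simplifying assumptions, not a description of our actual network or monitoring: the function names, the traffic figures, and the idea of a health check that retries five times are all hypothetical. It assumes a link simply drops whatever traffic exceeds its capacity, and that each probe is lost independently.

```python
def min_loss_fraction(offered_load: float, link_capacity: float) -> float:
    """Fraction of traffic dropped when offered load exceeds link capacity.

    Assumption: the link forwards up to its capacity and drops the rest.
    Units are arbitrary as long as both arguments use the same one.
    """
    if offered_load <= 0 or link_capacity >= offered_load:
        return 0.0
    return 1.0 - max(link_capacity, 0.0) / offered_load


def check_pass_probability(loss_fraction: float, attempts: int) -> float:
    """Chance that at least one of `attempts` probes gets through.

    Assumption: a hypothetical health check passes if any single probe
    succeeds, and each probe is lost independently with `loss_fraction`.
    """
    return 1.0 - loss_fraction ** attempts


if __name__ == "__main__":
    # Hypothetical numbers: the link carries at most 10% of normal traffic.
    loss = min_loss_fraction(offered_load=100.0, link_capacity=10.0)
    print(f"packet loss: {loss:.0%}")
    print(f"health check passes: {check_pass_probability(loss, attempts=5):.0%}")
```

Under these assumptions, a link that can carry at most 10% of the traffic loses at least 90% of packets, yet a check with five retries would still pass a bit over 40% of the time. A site in that state looks neither clearly up nor clearly down, which is the kind of ambiguity that can trip up both people and automated failover.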
- Feb 19, 2010 @ 12:00 am
Thanks for addressing this and informing us of the cause in a timely manner.
Always good to know the WP.com team is working so diligently. Especially when crazy ensues. Here’s hoping the issues behind this are isolated and properly resolved quickly with minimal fuss. :)
Thanks for the brief report/explanation, and for all the hard work you guys put in. Good to be up and running again.
I demand a refund! Oh wait…never mind! Totally joking here, BTW. Love what you folks do! :-)
Matt, this is the first time my blog has been down since I started it back in July. The response time by WordPress.com was quick and even this post went out quickly. Many users may not have even noticed. Thanks for explaining and I’m sure you will get to the bottom of it soon.
Sh*t happens, that’s the first outage I had to experience on wordpress.com, it’s not bad.
I for one truly appreciate your transparency about this problem. We all have things happen and nothing works 100% of the time. But it’s a rare organization that will step up, own it and learn from it! That’s what these things happen for: to learn! Kudos WordPress!
Yeah, it did suck. Twitter (and Huff Post) was a-twitter with comments, updates, speculations and conspiracy theories. The upside is you guys communicated pretty quickly and well — and though we couldn’t access our own blogs, all of us banded together in other ways, and held each other’s hands!
Also: Who knew there were 10.2 million WP blogs in the world?!
Thanks for solving the problem quickly. And yeah, four more years without an outage like this would be cool.
Thank you, for the update. No worries. Hugs to all of you
Thanks for the recovery efforts… it was kinda freakin’ me out. I can’t imagine what it was doing to you, who actually knew what was going on!
Thanks for giving us some info about it – I thought it was just my internet connection that was the problem!
You still provide a great service and I’d rather trust you guys with 2 hours of downtime in 4 years than some small time hosting company. And it’s also great that you’re quick and open to respond. Keep up the good work guys.
I was wondering why all WordPress blogs would not load. It took me like 30 minutes until it said “WordPress.com will be back soon!” Then it worked 10 minutes later. Thanks for fixing this problem ;-)
Wasn’t too bad really (at least for me)… glad you guys are up and running again! =D
Congratulations on the diligence you all showed fixing this.
Oh thank you so much! I was wondering why WordPress was down and why it took so long!
Thanks for keeping us informed.
When I was on Splinder Platform, being unable to post was a daily pity
The server was down at least a couple of hours each day … but it never happened to me (or to anyone else as far as I know) to receive any kind of explanation or excuse
I was a little sad I had to lose my old blog, as there is no way to import here from Splinder … but … it looks like I’ve found something …
(My english is terrible, hope you will understand I’m pleasantly surprised.)
Thanks for addressing the problem in a timely manner. I didn’t think the downtime was too bad, considering how wide-ranging it was.
That is an amazingly honest message. Lots of companies would have offered lame excuses or nothing but a “whoops!”
Let us know the cause when you get detailed information.
Keep doing the good job!
Keep up with the transparency and good luck with the fixing!
It was awful! I had to post something and it was down!
Thank you for letting us know! I honestly thought I was doing something wrong… or that it was Yahoo… or our ISP. Things happen. Y’all provide such an awesome service that this little snafu is just a hiccup. No doubt y’all will figure it all out. Again, thanks for letting us know.
I’m glad it’s working again, and thanks for the communication (otherwise I would probably blame my internet/router/computer)!
Your quick response time and explanation are much appreciated!
Wasn’t too bad. Thank you for keeping us informed! And … you rock! :-)
I really appreciate your update.
Maybe it’s that worldwide hack attack I’ve been reading about.
Eh. As long as we didn’t lose any data, that’s the most important thing.
New to WP, I thought it was a problem with my blog. Apparently, it was not the case! Take it easy Matt and your colleagues.
Excellent service back on track and great communication. Thanks.
Thanks for your vigilance. Keep up the good work!
Love you guys!
thanks for the message…you guys are great.
First I was worried for my blog being the only one, but then I notice that my little Chihuahua was giving birth, so I spent 2 hours helping her and she finally gave birth to 3 gorgeous little tiny mini chihuahuas. Now that I’m back in my computer, my blog is working just fine, so don’t worry.
Keep up the good work people.
Have a good day!
God Bless you all for dealing with such a traumatic set of bewildering circumstances! I was editing a post when it happened and now am re-doing all my work. But that doesn’t bother me at all. I am focusing on what a great tool and service you provide!
Thanks. This happened on redroom to me a while back and much of my content was lost; everything is o.k. thanks for the updates on Twitter. j
Honesty is the best policy and yours is deeply appreciated. Glad all is working again. Glitches happen to everyone so we understand. Thanks for providing a wonderful service.
When I disabled my stats plugin, the pages would load but wouldn’t collect “most read.” When I enabled it and re-entered the API code, the pages still load but the plugin still isn’t working. Any advice, or is that still a problem on your end?
A terrible problem, but a great response too! I appreciate the updates in secondary services (like Twitter), to keep us informed all the way.
Thanks and keep the good word, I’m very proud to be part of WordPress.com!
OK you had me worried there for a minute!
Sorry, I meant “keep the good work”, although ‘word’ applies too! ;-)
My blog isn’t updating…
Hey Thanks for the update! keep up the good work
OMG! I thought this was the end for my blog. I could not open the link to it. I was so worried. I hope nothing got affected. I thought all blogs would get deleted and we had to start over :shocked:
I’m sorry for your downtime, these things happen. What I really liked was you acknowledging that it sucked and giving us a reason, other than sorry for the inconvenience, blah, blah, blah …….
For future use, you might find out how to increase the size of the ‘VIEW’ menu, it’s a fight all the way and I can barely read it ~ Regards, Lynn
Thanks for the great work!
Thanks for your great work :-)
Thank you for your info; very professional and most of all, very nice!
This platform has allowed even techno-ignorami such as me to maintain and build a blog with steadily increasing readership for over a year now, and at a price that is beyond generous.
Thank you for your explanation and your great company.
As usual, you fixed the problem, you’re all great. It’s a pleasure to have a blog that has that personal interaction and caring.
Thanks a lot for the explanation. And really not surprised!
One thought came across my mind, imagine governments tackle their issues, ask the right questions and explain them the way WordPress does. That would be a Utopia!!
Thanks for being open about the failure and keeping us up-to-date.
Whew! I thought I broke it. Great work getting things back on line.
Your advising us of what happened means a lot to me. It shows your commitment to your users and a high degree of respect for customer service. Thank you very much on my behalf.
You guys are incredible! To provide this level of service to millions of free users. I don’t use WordPress.com but I do use the self hosted version and my dashboard stats and I think Gravatar weren’t working so I also appreciate the quick response to the downtime
Luckily, it didn’t affect me or any of the blogs I go on. :)
I was confused when I found WordPress was down. Thanks for telling us what was up.
Thank you for updating us, working hard to get it up and running again and keeping us informed. Your organization has customer service at the forefront and I’m sure all of us WP users appreciate it. I for one certainly do. Thanks again.
Thanks for posting this. Much better than saying nothing. Good service!
thx for your information…. :-)
Okay seriously there are much worse things that could happen, you know…like an earthquake or tsunami.
Thanks for caring and for letting us know :D
Thank you for bringing us back!
Thank you for keeping us in the loop! Very much appreciated, still a great service.
Even we in Hindi,Gujarati and English are enjoying the service.
Thanks for your vigilance. Keep up the good work!
wow…”traumatic”, “scary”, “awful”…sure glad we didn’t go through that here in RP.
Alrighty, thank you very much for letting me know. Hope everything turns out good for all of you. I like the blog site. I have just been really busy.
Thanks so much for getting back up and running again!
Really great work. I love WP!
Blogger downtime had very similar symptoms! I would suggest you contacting google to see if they can share knowledge about that one and maybe help with the one you just had.
Thanks for the detailed downtime report. It helps to know.
Since it was on my top traffic day ever, I thought I had broken wordpress.
Thanks for taking the time to explain it to us!
I just appreciate the opportunity to share my blog so effectively!
Hopefully you’ll get the problem fixed on your end so life can go back to a less stressful existence for you!
Good luck! ;)
thanx for the hard work
Wow, you admitted something went wrong….remarkable! Wish everyone had the business sense to do that.
Please keep doing all you’re doing. Yours is about as close to perfect a service as I’ve ever used. Rock on, photomatt and co.
Thanks for staying on it till it got fixed! Yall rock my face off!