WP.com Downtime Summary
Today WordPress.com was down for approximately 110 minutes, our worst downtime in four years. The outage affected 10.2 million blogs, including our VIPs, and appears to have deprived those blogs of about 5.5 million pageviews.
What Happened: We are still gathering details, but it appears an unscheduled change to a core router by one of our datacenter providers messed up our network in a way we haven’t experienced before, and broke the site. It also broke all the mechanisms for failover between our locations in San Antonio and Chicago. All of your data was safe and secure; we just couldn’t serve it.
What we’re doing: We need to dig deeper and find out exactly what happened and why. We also need to work out how to recover more gracefully next time, and how to isolate problems like this so they don’t affect our other locations.
I will update this post as we find out more, and have a more concrete plan for the future.
I know this sucked for you guys as much as it did for us — the entire team was on pins and needles trying to get your blogs back as soon as possible. I hope it will be much longer than four years before we face a problem like this again.
Update 1: We’ve gathered more details about what happened. There was a latent misconfiguration from a few months ago, specifically a cable plugged someplace it shouldn’t have been. Something called the spanning tree protocol kicked in and started trying to route all of our private network traffic to a public network over a link that was much too small and slow to handle even 10% of our traffic, which caused high packet loss. This “sort of working” state was much worse than if the network had just gone down, and it confused both our systems team and our failsafe systems. It is not clear yet why the misconfiguration bit us yesterday and not earlier. Even though the network issue was unfortunate, we responded too slowly in pinpointing the issue and taking steps to resolve it using alternate routes, making the downtime three to four times longer than it should have been.
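To illustrate why a “sort of working” link is harder to handle than a dead one, here is a minimal Python sketch of a health check that treats heavy packet loss as its own state instead of lumping it in with “up”. It is not our actual failover code: the names (ProbeWindow, classify, should_fail_over) and the loss thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class LinkState(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"  # the "sort of working" case: some packets get through
    DOWN = "down"


@dataclass
class ProbeWindow:
    """Result of a batch of probes sent over one link (hypothetical structure)."""
    sent: int
    received: int

    @property
    def loss(self) -> float:
        # With no probes sent we have no evidence the link works, so assume total loss.
        if self.sent == 0:
            return 1.0
        return 1.0 - self.received / self.sent


def classify(window: ProbeWindow,
             degraded_loss: float = 0.05,
             down_loss: float = 0.95) -> LinkState:
    """Map measured packet loss to a link state. Thresholds are illustrative assumptions."""
    loss = window.loss
    if loss >= down_loss:
        return LinkState.DOWN
    if loss >= degraded_loss:
        return LinkState.DEGRADED
    return LinkState.HEALTHY


def should_fail_over(state: LinkState) -> bool:
    # A check that only asks "did any probe succeed?" would keep a degraded
    # link in service. Here, anything other than HEALTHY triggers failover.
    return state is not LinkState.HEALTHY


if __name__ == "__main__":
    window = ProbeWindow(sent=100, received=60)
    state = classify(window)
    print(state.value, should_fail_over(state))
```

The design point is the middle state: a probe that occasionally succeeds looks “up” to a simple yes/no check, so the check has to measure loss over a window and act on it.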
Feb 19th at 12:03 am
Thanks for this very quick update! We appreciate it!
Feb 19th at 12:04 am
Honestly, I was horrified when I heard WordPress was down, but you did a great job of keeping us informed and getting back to business. Thanks so much for that!
Feb 19th at 12:04 am
Thanks for all your hard work!
Feb 19th at 12:05 am
Thanks for the info…..
I love you guys…..You’re doing a great Job….
After I did a level 2 diagnostic check (à la Star Trek) I knew it was the system and I just left things alone…
Keep up the good work!
Feb 19th at 12:06 am
We understand!:)
Feb 19th at 12:06 am
thanks for fixing it and many many thanks for keeping us posted via twitter
great crisis management!
Feb 19th at 12:06 am
congratulations, keep trying, and thanks for all your efforts…
Feb 19th at 12:06 am
I understand… since I can update my blog anytime on my end… but for how long will the downtime last?
Feb 19th at 12:25 am
Your blog was down, but now it is fine and you can update it whenever you like.
Feb 19th at 12:08 am
Thanks for addressing this and informing us of the cause in a timely manner.
Feb 19th at 12:08 am
Always good to know the WP.com team is working so diligently. Especially when crazy ensues. Here’s hoping the issues behind this are isolated and properly resolved quickly with minimal fuss.
Feb 19th at 12:08 am
Thanks for the brief report/explanation, and for all the hard work you guys put in. Good to be up and running again.
Feb 19th at 12:08 am
I demand a refund! Oh wait…never mind! Totally joking here, BTW. Love what you folks do!
Feb 19th at 12:11 am
Matt, this is the first time my blog has been down since I started it back in July. The response time by WordPress.com was quick and even this post went out quickly. Many users may not have even noticed. Thanks for explaining and I’m sure you will get to the bottom of it soon.
@Ileane
Feb 19th at 12:11 am
Sh*t happens, that’s the first outage I had to experience on wordpress.com, it’s not bad.
Feb 19th at 12:12 am
I for one truly appreciate your transparency about this problem. We all have things happen and nothing works 100% of the time. But it’s a rare organization that will step up, own it and learn from it! That’s what these things happen for: to learn! Kudos WordPress!
Feb 19th at 12:12 am
Yeah, it did suck. Twitter (and Huff Post) was a-twitter with comments, updates, speculations and conspiracy theories. The upside is you guys communicated pretty quickly and well — and though we couldn’t access our own blogs, all of us banded together in other ways, and held each other’s hands!
Also: Who knew there were 10.2 million WP blogs in the world?!
Thanks for solving the problem quickly. And yeah, four more years without an outage like this would be cool.
Feb 19th at 12:14 am
Thank you, for the update. No worries. Hugs to all of you
Feb 19th at 12:17 am
Thanks for the recovery efforts… it was kinda freakin’ me out. I can’t imagine what it was doing to you, who actually knew what was going on!
Feb 19th at 12:18 am
Thanks for giving us some info about it – I thought it was just my internet connection that was the problem!
Feb 19th at 12:19 am
You still provide a great service and I’d rather trust you guys with 2 hours of downtime in 4 years than some small time hosting company. And it’s also great that you’re quick and open to respond. Keep up the good work guys.
Feb 19th at 12:19 am
I was wondering why all WordPress blogs would not load. It took me like 30 minutes until it said “WordPress.com will be back soon!” Then it worked 10 minutes later. Thanks for fixing this problem
Feb 19th at 12:19 am
Wasn’t too bad really (at least for me)… glad you guys are up and running again! =D
Feb 19th at 12:20 am
Congratulations on the diligence you all showed fixing this.
Feb 19th at 12:21 am
Oh thank you so much! I was wondering why WordPress was down and why it took so long!
Feb 19th at 12:22 am
Thanks for keeping us informed.
Feb 19th at 12:22 am
When I was on Splinder Platform, being unable to post was a daily pity
The server was down at least a couple of hours each day … but it never happened to me (or to anyone else as far as I know) to receive any kind of explanation or excuse
… Thanks!
I was a little sad I had to lose my old blog, as there is no way to import here from Splinder … but … it looks like I’ve found something …
(My english is terrible, hope you will understand I’m pleasantly surprised.)
Feb 19th at 12:24 am
Thanks for addressing the problem in a timely manner. I didn’t think the downtime was too bad, considering how wide-ranging it was.
Feb 19th at 12:24 am
That is an amazingly honest message. Lots of companies would have offered lame excuses or nothing but a “whoops!”
Feb 19th at 12:25 am
Let us know the cause when you get detailed information.
Keep doing the good job!
Feb 19th at 12:26 am
Keep up with the transparency and good luck with the fixing!
Feb 19th at 12:27 am
It was awful! I had to post something and it was down!
Feb 19th at 12:27 am
Thank you for letting us know! I honestly thought I was doing something wrong… or that it was Yahoo… or our ISP. Things happen. Y’all provide such an awesome service that this little snafu is just a hiccup. No doubt y’all will figure it all out. Again, thanks for letting us know.
Feb 19th at 12:28 am
I’m glad it’s working again, and thanks for the communication (otherwise I would probably blame my internet/router/computer)!
Feb 19th at 12:28 am
Your quick response time and explanation are much appreciated!
Feb 19th at 12:29 am
Wasn’t too bad. Thank you for keeping us informed! And … you rock!
Feb 19th at 12:29 am
I really appreciate your update.
Feb 19th at 12:29 am
Maybe it’s that worldwide hack attack I’ve been reading about.
Feb 19th at 12:32 am
There were no hacks or denial-of-service attacks.
Feb 19th at 12:29 am
Eh. As long as we didn’t lose any data, that’s the most important thing.
Feb 19th at 12:29 am
New to WP, I thought it was a problem with my blog. Apparently, it was not the case! Take it easy Matt and your colleagues.
Feb 19th at 12:29 am
Excellent service back on track and great communication. Thanks.
Feb 19th at 12:30 am
Thanks for your vigilance. Keep up the good work!
Feb 19th at 12:31 am
Love you guys!
Feb 19th at 12:32 am
thanks for the message…you guys are great.
Feb 19th at 12:32 am
First I was worried for my blog being the only one, but then I noticed that my little Chihuahua was giving birth, so I spent 2 hours helping her and she finally gave birth to 3 gorgeous little tiny mini chihuahuas. Now that I’m back at my computer, my blog is working just fine, so don’t worry.
Keep up the good work people.
Have a good day!
Feb 19th at 12:34 am
God Bless you all for dealing with such a traumatic set of bewildering circumstances! I was editing a post when it happened and now am re-doing all my work. But that doesn’t bother me at all. I am focusing on what a great tool and service you provide!
Feb 19th at 12:35 am
Thanks. This happened on redroom to me a while back and much of my content was lost; everything is o.k. thanks for the updates on Twitter. j
Feb 19th at 12:35 am
Honesty is the best policy and yours is deeply appreciated. Glad all is working again. Glitches happen to everyone so we understand. Thanks for providing a wonderful service.
Feb 19th at 12:39 am
When I disabled my stats plugin, the pages would load but wouldn’t collect “most read.” When I enabled it and re-entered the API code, the pages still load but the plugin still isn’t working. Any advice, or is that still a problem on your end?
Rob
Feb 19th at 12:39 am
A terrible problem, but a great response too! I appreciate the updates in secondary services (like Twitter), to keep us informed all the way.
Thanks and keep the good word, I’m very proud to be part of WordPress.com!
Feb 19th at 12:40 am
OK you had me worried there for a minute!
Feb 19th at 12:41 am
Sorry, I meant “keep the good work”, although ‘word’ applies too!
Feb 19th at 12:44 am
my blog isn’t updating…
Feb 19th at 12:46 am
Hey Thanks for the update! keep up the good work
Feb 19th at 12:48 am
OMG! I thought this was the end for my blog. I could not open the link to it. I was so worried. I hope nothing got affected. I thought all blogs would get deleted and we had to start over :shocked:
Feb 19th at 12:49 am
I’m sorry for your downtime, these things happen. What I really liked was you acknowledging that it sucked and giving us a reason, other than sorry for the inconvenience, blah, blah, blah …….
For future use, you might find out how to increase the size of the ‘VIEW’ menu, it’s a fight all the way and I can barely read it ~ Regards, Lynn
Feb 19th at 12:49 am
Thanks for the great work!
Feb 19th at 12:51 am
Thanks for your great work
Feb 19th at 12:55 am
Thank you for your info; very professional and most of all, very nice!
Feb 19th at 12:55 am
Dear Matt,
This platform has allowed even techno-ignorami such as me to maintain and build a blog with steadily increasing readership for over a year now, and at a price that is beyond generous.
Thank you for your explanation and your great company.
David
Feb 19th at 12:56 am
As usual, you fixed the problem, you’re all great. It’s a pleasure to have a blog that has that personal interaction and caring.
Feb 19th at 12:57 am
Thanks a lot for the explanation. And really not surprised!
One thought came across my mind, imagine governments tackle their issues, ask the right questions and explain them the way WordPress does. That would be a Utopia!!
Feb 19th at 12:58 am
Thanks for being open about the failure and keeping us up-to-date.
Feb 19th at 12:58 am
Whew! I thought I broke it. Great work getting things back on line.
Feb 19th at 12:59 am
Your advising us of what happened means a lot to me. It shows your commitment to your users and a high degree of respect for customer service. Thank you very much on my behalf.
Feb 19th at 1:00 am
You guys are incredible! To provide this level of service to millions of free users. I don’t use WordPress.com, but I do use the self-hosted version, and my dashboard stats and, I think, Gravatar weren’t working, so I also appreciate the quick response to the downtime.
Feb 19th at 1:03 am
Luckily, it didn’t affect me or any of the blogs I go on.
Feb 19th at 1:06 am
I was confused when I found WordPress was down. Thanks for telling us what was up.
Feb 19th at 1:07 am
Thank you for updating us, working hard to get it up and running again and keeping us informed. Your organization has customer service at the forefront and I’m sure all of us WP users appreciate it. I for one certainly do. Thanks again.
Feb 19th at 1:08 am
Thanks for posting this. Much better than saying nothing. Good service!
Feb 19th at 1:08 am
thx for your information….
Feb 19th at 1:09 am
Okay seriously there are much worse things that could happen, you know…like an earthquake or tsunami.
Thanks for caring and for letting us know
Feb 19th at 1:20 am
Thank you for bringing us back!
Feb 19th at 1:21 am
Thank you for keeping us in the loop! Very much appreciated, still a great service.
Feb 19th at 1:22 am
Great Service.
Even we in Hindi,Gujarati and English are enjoying the service.
Thanks for your vigilance. Keep up the good work!
Rajendra M.Trivedi,M.D.
Feb 19th at 1:22 am
wow…”traumatic”, “scary”, “awful”…sure glad we didn’t go through that here in RP.
Feb 19th at 1:23 am
Alrighty, thank you very much for letting me know. Hope everything turns out good for all of you. I like the blog site. I have just been really busy.
honeybeez/unfat tips
Feb 19th at 1:24 am
Thanks so much for getting back up and running again!
Feb 19th at 1:24 am
Really great work. I love WP!
Feb 19th at 1:25 am
http://blogger-status.blogspot.com/2009/11/postmortem-on-recent-issues-affecting.html
Blogger downtime had very similar symptoms! I would suggest contacting Google to see if they can share knowledge about that one and maybe help with the one you just had.
Feb 19th at 1:27 am
Thanks for the detailed downtime report. It helps to know.
Since it was on my top traffic day ever, I thought I had broken wordpress.
Feb 19th at 1:27 am
Thanks for taking the time to explain it to us!
I just appreciate the opportunity to share my blog so effectively!
Hopefully you’ll get the problem fixed on your end so life can go
back to a less stressful existence for you!
Thanks again!
Good luck!
Feb 19th at 1:28 am
Stuff happens…
PT/TB
Feb 19th at 1:29 am
thanx for the hard work
nice
Feb 19th at 1:32 am
Wow, you admitted something went wrong….remarkable! Wish everyone had the business sense to do that.
Feb 19th at 1:35 am
Please keep doing all you’re doing. Yours is about as close to perfect a service as I’ve ever used. Rock on, photomatt and co.
Feb 19th at 1:36 am
Thanks for staying on it till it got fixed! Y’all rock my face off!