Amazon cloud outage underscores limits of automation

In this week-in-review podcast, our editors discuss the implications of the latest Amazon cloud outage and the human error that caused it.

Yet another service disruption at Amazon Web Services (AWS) added to the company's hall of shame last month, marking the fourth Amazon cloud outage of 2012. But this time the failure couldn't be blamed on thunder and lightning or software bugs. In its postmortem of the event, Amazon acknowledged that human error was responsible for several hours of downtime for some of its customers, including high-profile AWS user Netflix.

Site editor Jessica Scarpati and news writer Gina Narcisi return with their first week-in-review podcast for 2013 to discuss this story and more for the week of Jan. 21, 2013:

The following is a transcript of the podcast.

Jessica Scarpati: You're listening to CloudCast Weekly. I'm Jessica Scarpati, site editor. With me, as always, is Gina "Gee Whiz" Narcisi, here to talk about cloud news.

Gina Narcisi: Hi, Jessica. It's nice to be back for our first CloudCast of 2013.

Scarpati: I know. It's our big comeback after kind of a long break.

Narcisi: That's right.

Scarpati: 2012 kind of ended not with a bang, but a whimper for Amazon after they had yet another outage. Gina, why don't you tell us a little bit about what you found out about that?

Narcisi: Yeah, Jessica, so basically I went into this thinking it was going to be sort of a "day two" story about Amazon Web Services crashing on Christmas Eve. I don't want to say "crashing" necessarily, but their load balancers ended up not doing what they were supposed to do, essentially, because a programmer had accidentally deleted some information that the load balancers needed. I think about 6% of those load balancers were not functioning properly, and they had to kind of shut down the rest. It did affect some big customers like Netflix. Then when I was working on this story, I kind of took the angle of load balancing in the cloud. It's something that, obviously, all cloud providers need, but I guess I was under the impression that there was more automation than there actually is in the cloud. This type of error can kind of happen to anybody. It's human error. I ended up talking to Infoblox, which is a network management vendor.

They told me about their NetMRI solution, which can kind of solve some of this. It lessens the human error element, for sure, because it adds more automation to cloud environments. They have a lot of cloud provider customers now, and it's growing. There are just a lot of little pieces that need to be controlled in an environment like that. That's one way you can go about it. I also spoke to Riverbed, which connected me with one of its customers, a cloud provider, Joyent, that I've spoken to before. They were telling me that they actually offer load balancing sort of as a service the way Amazon does, where they control it for you, but they also offer it for customers to control themselves. So if a customer messes up or deletes information, it's only going to affect that one customer. It's not going to affect everyone the way Amazon's failure affected everyone.

Scarpati: It sounds like what you're saying is, with Joyent, each customer is in charge of only its own configurations?

Narcisi: Yeah, they have both. Either the provider is in charge of load balancing for its customers, or the customers are in charge of it themselves. Those are the two methods that are out there right now. Some cloud providers, like Amazon, maybe don't want to put that much control into the hands of the customers, but I did speak to several analysts who thought that there needs to be more customer control.

Scarpati: You brought up a good point earlier about automation. Everyone does think that the cloud is so automated and all of these things kind of happen behind the curtain without any human intervention, but there is still some human element to it, it sounds like.

Narcisi: For sure. I brought it up with one of the analysts I was talking to. I said, "I just thought all this stuff just sort of happens." He told me absolutely not. There's actually a lot less automation than everyone thinks.

Scarpati: This has been Amazon's -- I don't even know what number failure this is. Do you think this is going to affect their standing at all?

Narcisi: I'm not sure, just because a lot of people think that it's still Amazon no matter how you cut it, and an error like this is different from their other failures. A lot of their other failures had to do with their data center, the weather, and stuff that was out of their control. At the same time, other providers in the same areas didn't have these failures during storms. This was sort of different, this element of human error, so I guess it's yet to be seen what customers really think and how reliable they consider Amazon to be.

Scarpati: Especially since it sounded like it was kind of small scale. I think the way customers are probably going to look at it is like, "Here we go, just one more outage from Amazon."

Narcisi: Right, and it doesn't help that they have a big customer like Netflix relying on them. I saw this on the news right on Christmas Day, so it's out there. It's more in your face than with other providers, for sure.

Scarpati: All right, Gina, thank you so much.

Narcisi: Thanks, Jessica.

Scarpati: It's a new year, which means it's the season for market predictions and forecasts. Resident cloud expert Tom Nolle, president of CIMI Corporation, gives us his view for 2013 in his latest piece, an article titled "Five Big Provider Trends in 2013: The Future of Cloud Computing is PaaS." We admit the title is kind of a spoiler, but there's much more to the story. Infrastructure as a Service is already on the way to becoming commoditized. Any network operator that's been selling bandwidth for the past 10 years knows how that movie ends, so cloud providers in need of higher profit margins are likely to find them in Platform as a Service. The fact that it's a higher-level service is just one part of it, though. The real money in PaaS won't just come from selling it on its own; providers need a strategy. That strategy should include working with developers who create applications that are built and optimized for the cloud from day one. If providers are lucky, they'll wind up building a whole ecosystem to go along with those apps. Check out Tom's article to find out how Platform as a Service and other trends like SOA and OpenStack are going to shape the cloud provider market this year.

That brings us to the end of CloudCast Weekly. Be sure to check out all of the articles we talked about and more. Thanks for listening.
