The fine print in service-level agreements (SLAs) exists for reasons beyond placating corporate legal counsel. Aggressive guarantees around availability, reliability and performance can help providers attract customers, but what happens when things go awry and a provider fails to meet the terms of its cloud computing SLA?
Cloud providers may find that making high-reaching promises around service delivery is the easy part; following up when things go bad is where it gets tricky.
"Everybody knows things break -- it's the nature of life. It's how you respond to things breaking that sets you apart as a cloud service provider," said Chris Drumgoole, vice president of global operations at Terremark, a Miami-based cloud provider and wholly owned subsidiary under Verizon Communications Inc.
When an SLA violation occurs, it's safe to say that providers will be dealing with unhappy customers. If a provider breaks one of the promises in a cloud computing SLA, the consequences can threaten the profitability of the customer's company. Luckily, there are ways to make the post-violation process less dire for all involved.
Planning for 'what if' scenarios
When service providers and their customers negotiate an SLA, both parties should discuss the "what if" scenarios up front. This way, there are no surprises about the remediation process after an SLA violation occurs.
"[Cloud providers] need to tell their potential customers that 'This SLA is our goal, and if we don't meet our goal, we're going to call you up and blow kisses at you to show you that we really care. We will show you that we have a process in place that is going to fix [any issues that arise],'" said Tom Nolle, president of CIMI Corp.
Terremark has what it calls "incident procedures" that dictate how it responds to an SLA violation, Drumgoole said. If a serious incident occurs, Terremark staff is prepared to sit down with customers and go over what happened line by line.
"We formally follow up with the customer with a written post-mortem, including a timeline that states what happened, who or what screwed up, or what human failure occurred," he said. "We could also address what process or technology failure there was, or whatever specifically happened to make our services not live up to expectations. We inform them of the corrective action we take to remedy certain situations."
Customers are more at ease when these kinds of detailed procedures are in place, because the procedures show that cloud providers are ready and able to respond when something goes wrong. Large enterprise customers in particular demand this type of transparency.
It's advisable to involve high-ranking members of your cloud services staff in the remediation process. Sometimes enterprise IT staff members are just looking for assurance that they won't incur the wrath of upper management when services go down, and they need the provider to take responsibility and offer them some refuge.
"Cloud providers should clue in customers that there is a plan every step of the way," said Nolle. "Maybe at a certain point [in the remediation process], the cloud provider CEO has to call the CEO of the [customer company]. This makes customers feel better knowing that the CEO of the cloud company is going to have to grovel before the customer CEO if services go down for an extended period of time."