Azure Screen Of Death caused by expired SSL certificate will cost embarrassed Microsoft in SLA refunds
The Microsoft Windows Azure cloud suffered a major outage on Friday when an SSL security certificate for Azure Storage expired, breaking business applications and Microsoft’s own Xbox Live for twelve hours.
The company has offered refunds to those affected, as specified in its service level agreements, but has not yet explained how it allowed the certificate, which is needed for secure access to data stored on the Azure cloud, to expire. The incident is an uncanny reminder of the failure one year ago, when Windows Azure failed to take account of the extra Leap Year day in February, and it has had commentators’ jaws dropping.
Azure cloud of death
“On Friday, February 22, Windows Azure Storage was affected by an expired certificate,” said Adrienne Hall, general manager, Microsoft Trustworthy Computing. “We have completed the restoration and all services are back online. For more information please go to the Service Dashboard.”
Anyone storing critical data on a cloud service should be using SSL (secure sockets layer) to access that data. Once the certificate expired, clients could no longer validate their secure connections, and critical applications trying to use Azure Storage failed.
Discussion on the Windows Azure Forums quickly identified the problem and produced evidence in the form of a screenshot (shown here). The most obvious temporary fix was to switch off the secure protocol HTTPS, and use HTTP, but users running real business data on their Azure cloud instances might have been reluctant to do that.
Anger quickly swelled, with commentators lambasting Microsoft for a “trivial” error. “This is quite a stupid mistake,” said analyst Clive Longbottom of QuoCirca. “All the technology was working – they just forgot to pay the bill for a certificate.”
This kind of error is very easy to make, but this does not let Microsoft off the hook: “It does happen to private sites as well – regularly,” commented Longbottom.
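For readers who want to guard against the same slip on their own sites, the check can be automated. The following is a minimal sketch in Python, not a description of anything Microsoft runs: the function names and the 30-day warning window are illustrative assumptions, and only standard-library calls are used. It reads the expiry date a server presents and reports how many days remain.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the days between `now` and a certificate's notAfter time.

    `not_after` is in the format returned by SSLSocket.getpeercert(),
    for example "Feb 22 12:00:00 2013 GMT". `now` must be timezone-aware.
    A negative result means the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when fewer than `warn_days` remain (30 is an assumed default)."""
    return days_until_expiry(not_after, now) < warn_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect to host:port over TLS and return the certificate's notAfter.

    With the default context, a certificate that has already expired makes
    the handshake raise ssl.SSLCertVerificationError, so this helper is
    only useful for warning ahead of time.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Run daily from a scheduler, something like `needs_renewal(fetch_not_after("example.com"), datetime.now(timezone.utc))` would raise the alarm weeks before a certificate lapses, rather than after customers notice. The hostname here is a placeholder.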
Last year’s embarrassing Leap Year error also hit Windows Azure’s certificates and Microsoft did – eventually – publish a very full explanation of how the mistake was made.
This first appeared on TechWeekEurope UK.