Akamai and Amazon Hit by Outages

By Sean Michael Kerner | Aug 9, 2011
http://www.enterprisenetworkingplanet.com/netsysm/cloudoutage.html

For enterprises that outsource their website and application hosting to the cloud or a content delivery network (CDN), performance is often one of the key reasons they moved in the first place. Both cloud and CDN providers promise better performance, backed by guaranteed uptimes and Service Level Agreements (SLAs).

Yesterday, uptime was an issue for both Akamai and Amazon, as both services experienced outages.

An Akamai spokesperson confirmed to InternetNews.com that the company had an outage on Monday, August 8, that started at approximately 3:30 p.m. ET. The outage lasted for 30 minutes.
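For a rough sense of what a 30-minute outage means against an uptime guarantee, the arithmetic can be sketched in a few lines of Python. The 30-day month and the 99.9 and 99.99 percent targets are assumptions chosen for illustration; they are not taken from Akamai's or Amazon's actual SLA terms.

```python
# Assumption: a 30-day billing month; real SLAs define their own periods.
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def availability_percent(downtime_minutes, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Percentage of the period during which the service was up."""
    return 100.0 * (1 - downtime_minutes / period_minutes)


def downtime_budget_minutes(target_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Minutes of downtime a given availability target allows in the period."""
    return period_minutes * (1 - target_percent / 100.0)


if __name__ == "__main__":
    # The 30-minute figure is the outage length reported by Akamai.
    print(f"30-minute outage: {availability_percent(30):.3f}% availability")
    # Hypothetical SLA targets, for comparison only.
    for target in (99.9, 99.99):
        budget = downtime_budget_minutes(target)
        print(f"{target}% target allows {budget:.2f} minutes of downtime")
```

Under these assumptions, a single 30-minute outage works out to roughly 99.93 percent availability for the month, which fits inside a 99.9 percent budget (about 43 minutes) but not a 99.99 percent one (about 4 minutes).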

"A configuration change to our DNS infrastructure intended to remove some machines from service created an inconsistency between the IPv4 and IPv6 configurations for our DNS servers," Akamai's spokesperson said. "This otherwise routine change caused some DNS servers to restart repeatedly."

The spokesperson explained that the Akamai content delivery network system uses multiple techniques that are intended to automatically recover from changes that cause improper restarts. The Akamai DNS servers contain numerous software components, but the impacted component used an earlier version of crash rejection, which only rejected messages that directly caused an immediate crash.
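The difference between rejecting only messages that crash a server immediately and rejecting messages that cause a later crash can be illustrated with a toy simulation. The sketch below is not Akamai's software: the message format, the run_server function, and the blame_recent flag are all hypothetical, and the "blame the most recently applied message" rule is a crude stand-in for whatever the upgraded logic actually does.

```python
class Crash(Exception):
    """Stands in for a process crash in this simulation."""


def run_server(messages, blame_recent=False, max_restarts=5):
    """Replay config messages across restarts.

    Returns (restarts_needed, rejected_ids), or (None, rejected_ids)
    if the server is still crash-looping after max_restarts.
    """
    rejected = set()  # message ids the server refuses to apply again
    for restarts in range(max_restarts + 1):
        applied = []      # messages applied cleanly since this restart
        in_flight = None  # message being applied when a crash hits
        try:
            for msg in messages:
                if msg["id"] in rejected:
                    continue
                in_flight = msg["id"]
                if msg.get("crash_on_apply"):
                    raise Crash()  # immediate crash while applying
                in_flight = None
                applied.append(msg)
            # All messages applied; a bad one may still crash the server later.
            if any(m.get("crash_on_serve") for m in applied):
                raise Crash()
            return restarts, rejected  # server is stable
        except Crash:
            if in_flight is not None:
                # Basic rejection: blame the message that crashed directly.
                rejected.add(in_flight)
            elif blame_recent and applied:
                # Broader rejection (assumed): blame the latest applied message.
                rejected.add(applied[-1]["id"])
    return None, rejected


if __name__ == "__main__":
    immediate = [{"id": "a"}, {"id": "b", "crash_on_apply": True}]
    delayed = [{"id": "a"}, {"id": "c", "crash_on_serve": True}]
    print(run_server(immediate))                   # recovers after one restart
    print(run_server(delayed))                     # keeps restarting
    print(run_server(delayed, blame_recent=True))  # recovers after one restart
```

In this model, the basic logic recovers from the message that crashes on apply but restarts repeatedly on the message whose damage only shows up afterward, which mirrors the behavior the spokesperson described.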

"We’ve already begun working on the fix to handle the inconsistency between IPv4 and IPv6 and will be upgrading the crash rejection logic in the DNS software," the Akamai spokesperson said.

Akamai hosts some of the most popular websites in the world on its network, including apple.com. The Akamai spokesperson was unable to discuss which specific Akamai customers may or may not have been impacted by the Monday outage. That said, according to Akamai, the impact of the outage was relatively small.

"I can say that out of all traffic carried by Akamai in a 24 hour period, the DNS issue impacted only 0.3 percent of total traffic," the Akamai spokesperson said.

Amazon's EC2 cloud service also experienced an outage on Monday, with trouble in its U.S. East datacenter in Virginia as well as its EC2 cloud in Ireland. The Virginia outage was reported on Amazon's AWS dashboard at 7:39 PDT as a connectivity issue. By 8:03 PDT, Amazon reported that full connectivity had been restored and that the service was operating normally.

The situation in Ireland was first reported late Sunday, and as of Tuesday morning Amazon was still working on restoring customer data.

"We have now delivered recovery snapshots for over half of the volumes that were in an inconsistent state as a result of the power outage," Amazon wrote on its AWS Dashboard at 8:06 AM PDT. "We are continuing to make steady progress on creation and delivery of the remaining recovery snapshots."

The Virginia and Ireland outages follow an outage in late April that Amazon attributed to 'human error.'

The recent outages at Amazon may be a cause for concern for those considering a move to the cloud, and they could also raise questions about security, according to at least one security researcher.

"Amazon’s cloud service offerings explicitly state that they make no promises in regards to the confidentiality, integrity, or availability (the security CIA triad) of their customers’ data," Francis Brown, security researcher at Stach & Liu, told InternetNews.com. "Recent outages confirm their sincerity about availability, which can only make you wonder about the confidentiality and integrity of your data."

Sean Michael Kerner is a senior editor at InternetNews.com, the news service of Internet.com, the network for technology professionals.