Thursday 24 September 2015

Why the AWS Crash Should Make You Wary of Outsourcing Your Cloud

When Amazon Web Services crashed on Sunday, it took a host of high-profile clients down with it. These clients are hosted in US-EAST-1, the Northern Virginia site that is AWS’s oldest facility for the public cloud.

AWS is the proud home of prestigious clients like Netflix, Amazon, IMDb, and Tinder. When it collapsed on Sunday, users faced disruptions to some of their favorite online services.

This was not the first AWS outage, and it may not be the last. The first major outage occurred in 2011, when the same facility ran out of capacity. That crash took down sites like Reddit and Foursquare and was considered the worst cloud crash of all time.

Shortly after the 2011 crash, Amazon Web Services released a report that described how the AWS cloud works, explained what caused the outage, and laid out a plan of action to prepare for similar events in the future.

Netflix saved by chaos

By the Monday morning following the 2015 outage, Amazon was well into recovery mode. However, in a fast-paced world, a day of Web outages comes with a price tag in the millions of dollars. Fortunately, some clients, like Netflix, were already prepared for the outage.

Netflix, which relies on AWS to run its worldwide streaming operations, did not see any major damage from the outage because it practices what is known as chaos engineering. It attacks its own services with software, most famously its Chaos Monkey tool, that is designed to cause failures within its systems. The approach works because it forces a company to prepare for even the worst-case scenarios.
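The idea of deliberately attacking your own services can be sketched in a few lines. This is a toy simulation, not Netflix’s tooling: the instance names, the `chaos_round` helper, and the fixed random seed are all hypothetical and exist only to illustrate the principle of breaking something on purpose and then checking that the service still answers.

```python
import random


def serve(request, instances):
    """Answer from the first healthy instance; raise if none is left."""
    for name, healthy in instances.items():
        if healthy:
            return f"{name} handled {request}"
    raise RuntimeError("no healthy instance available")


def chaos_round(instances, rng):
    """Deliberately knock out one randomly chosen healthy instance."""
    healthy = [name for name, ok in instances.items() if ok]
    victim = rng.choice(healthy)
    instances[victim] = False
    return victim


def run_experiment(instance_count=3, kills=2, seed=0):
    """Kill `kills` instances (hypothetical names), then try to serve a request."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    instances = {f"instance-{i}": True for i in range(instance_count)}
    killed = [chaos_round(instances, rng) for _ in range(kills)]
    return killed, serve("GET /", instances)


if __name__ == "__main__":
    killed, reply = run_experiment()
    print("killed:", killed)
    print("reply:", reply)
```

With three instances and two deliberate kills, the request is still served by the survivor. Kill as many instances as exist and `serve` raises an error, which is exactly the kind of weakness such an experiment is meant to expose before a real outage does.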

That preparation let Netflix handle Sunday’s threats. Instead of finding itself blindsided by the outage, Netflix rerouted traffic away from the affected AWS region to functioning data centers elsewhere. It stayed resilient to the issues thrown at it by AWS because it does not depend solely on a single region.
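Rerouting traffic away from a failed region comes down to a simple rule: send requests to the first region in your preference list that still passes a health check. The sketch below is an assumption-laden illustration, not a description of Netflix’s actual routing. The `pick_region` function is hypothetical, and the fallback regions are examples only, since the article does not say where the traffic went.

```python
def pick_region(preferred_regions, health):
    """Return the first region, in preference order, that is reported healthy.

    `health` maps region name -> bool; unknown regions count as unhealthy.
    """
    for region in preferred_regions:
        if health.get(region, False):
            return region
    raise RuntimeError("all regions are down")


# Illustrative data: us-east-1 is the region named in the article;
# the other two are example fallbacks chosen for this sketch.
preferred = ["us-east-1", "us-west-2", "eu-west-1"]
health = {"us-east-1": False, "us-west-2": True, "eu-west-1": True}

print(pick_region(preferred, health))
```

The hard part in practice is not this selection logic but having working capacity in the fallback regions before the outage starts.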

Outages happen

Netflix is proof that there is no need to veer away from outsourcing your cloud. Most of the time, outsourcing to a cloud service like AWS is a great way to run your website. But you should never wander naively into using a cloud service; doing so leaves you blindsided when problems arise.

It is important to remember that whether you host your own cloud or hand it to Amazon, outages happen. They will happen in your office just like they would happen in anyone else’s office.

AWS is not the only cloud to have ever suffered. Office 365 and Gmail have both experienced outages at some point.

So, rather than abandon the cloud or outsourcing, you need a backup plan to deal with possible issues.

Although the cloud itself is a backup policy, you actually need to have a backup plan for your backup plan. This is because there are only two rules in computing: technology evolves and technology fails. Failures do not discriminate.

Netflix’s use of chaos engineering is an insurance policy, and an expensive one, but it is there to save the company millions of dollars when the inevitable failures arrive.

Resiliency trade-off

Fortunately, there is a way to prepare yourself for successful cloud use in good times and bad. It is called a resiliency trade-off. For this, you must either assume that the service is robust or that the service is fragile. When you assume a service is robust, you treat the infrastructure beneath it as fragile. As a result, you work hard to ensure that the infrastructure is as close to perfect as possible. This gives companies room to invest in the expensive infrastructure that helps offset the issues inherent in technology.

This is important because the more pieces there are, the more likely it is that some part of the system will go wrong. Companies like Amazon and Netflix realize that even when they have robust hardware, something will break. They have adopted this assume-robust method to deal with the almost guaranteed breakages.
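A rough calculation shows why more pieces mean more breakage. The sketch below assumes, purely for illustration, that every component fails independently and with the same probability. The function name and the 1% figure are made up for the example and do not come from AWS or Netflix.

```python
def chance_of_any_failure(component_count, failure_probability):
    """Probability that at least one component fails.

    Assumes independent components that share one failure probability.
    """
    chance_all_survive = (1.0 - failure_probability) ** component_count
    return 1.0 - chance_all_survive


# Hypothetical 1% failure chance per component, for growing system sizes.
for count in (1, 10, 100, 1000):
    print(count, round(chance_of_any_failure(count, 0.01), 3))
```

Under these assumptions, a system of 100 components has roughly a 63% chance that at least one of them is broken, even though each piece on its own is 99% reliable.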

This works better than the other side of the trade-off: assuming fragile. Adding infrastructure and software to fix outages as soon as possible is hard and expensive. But it is still less expensive than going the cheaper route and having to fix everything when it breaks. And it will break.

The AWS outage should make you wary about outsourcing your cloud computing. If you have never worried that your cloud will fail, AWS is happy to remind you that it can and will. Whether you outsource your computing or do it in your garage, problems are inherent in technology.

The real lesson that all businesses can learn from AWS, and Netflix by extension, is that disaster preparedness is the key to survival.

The post Why the AWS Crash Should Make You Wary of Outsourcing Your Cloud appeared first on AllBusiness.com.
