Why Does the Sad Passing of Michael Jackson Prove the Need for Cloud and Dynamic Compute?
This has been one heck of a sad week for the entertainment industry: Ed McMahon, Farrah Fawcett and Michael Jackson all passed away. Whenever significant cultural events like this occur, there is an explosion in communication among people who want to know what happened and discuss it with their peers. In the past, this would have been limited to talking with your neighbors, family, and friends, either in person or over a traditional POTS line. Fast forward to the 21st century and we now have real-time, bidirectional communication between virtually anyone, anywhere in the world.
When you have an unpredictable event like the death of a societal icon, or the launch of a new service that has the potential for extremely rapid adoption (or at the very least high traffic due to curiosity alone), it is very difficult, if not practically impossible, to anticipate the real-world resources needed to support the inbound demand. This is clearly shown by the chart from Keynote Systems illustrating the availability and performance impact of this event on news websites.
Image from: http://www.datacenterknowledge.com/archives/2009/06/25/michael-jackson-news-slows-web-sites/
TMZ.com was the first news outlet to break the story of Michael Jackson's death, and consequently their site collapsed under the unexpected workload. It's hard to fault the IT team responsible for TMZ's service delivery. After all, no one knew MJ was going to pass away yesterday.
So where am I going with all of this? To the clouds, of course! If there was ever a real-world example of where a cloud solution would have played nicely into delivering a service hit by transient, high-intensity workloads that arrive without warning, this is it. Even a properly architected high-volume application or service that is designed to handle large increases in transient load has a finite capacity. Now, what if TMZ.com had the ability to automatically spin up cloud resources and shunt the new traffic load over to the cloud during the media frenzy? It could have kept performance and availability through the peak of traffic at something close to their normal service levels (for the shunting, I'm a big fan of F5 gear for application delivery networking).
Now, they could have done this manually, I suppose. When they saw the traffic coming they could have provisioned some AWS instances, gotten their site/content up and running, and started routing traffic through a change to their load balancers. That'll work, but it's manual and it's going to take time to implement. In fact, by the time they're set up, their end users may have already hit a dead site and gone to one of their competitors. So what to do? Automate!
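To make the "automate" idea a little more concrete, here is a minimal Python sketch of the decision an automated burst-to-cloud controller might make on each polling interval. Everything in it is an assumption for illustration: the function names, the requests-per-second figures, and the 25% headroom are hypothetical, and the actual provisioning and load balancer calls (which depend entirely on your cloud provider and ADN gear) are deliberately left out.

```python
import math


def cloud_instances_needed(current_rps, local_capacity_rps,
                           instance_capacity_rps, headroom=0.25):
    """Return how many cloud instances would absorb traffic that exceeds
    what the existing (local) infrastructure can serve.

    All parameters are hypothetical planning numbers:
      current_rps           -- observed requests per second right now
      local_capacity_rps    -- what the existing site can handle
      instance_capacity_rps -- what one cloud instance can handle
      headroom              -- extra fraction to provision as a safety margin
    """
    if instance_capacity_rps <= 0:
        raise ValueError("instance_capacity_rps must be positive")
    # Plan for the observed load plus a safety margin.
    target_rps = current_rps * (1 + headroom)
    # Only the portion above local capacity needs to go to the cloud.
    overflow_rps = max(0.0, target_rps - local_capacity_rps)
    return math.ceil(overflow_rps / instance_capacity_rps)


def cloud_traffic_share(current_rps, local_capacity_rps):
    """Return the fraction of traffic the load balancer would need to
    shunt to the cloud so the local site stays at or below capacity."""
    if current_rps <= local_capacity_rps:
        return 0.0
    return (current_rps - local_capacity_rps) / current_rps


# A normal day: traffic fits locally, so nothing is provisioned.
quiet = cloud_instances_needed(1000, 5000, 500)

# A media frenzy: traffic is four times what the site can serve.
frenzy = cloud_instances_needed(20000, 5000, 500)
share = cloud_traffic_share(20000, 5000)
print(quiet, frenzy, share)
```

With these made-up numbers, the quiet case asks for 0 instances, while the frenzy case asks for 40 instances and routes 75% of requests to the cloud. A real controller would run this kind of check continuously and feed the result into whatever provisioning and traffic management tooling is in place, which is exactly the part a human is too slow to do by hand.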

