Why Does the Sad Passing of Michael Jackson Prove the Need for Cloud and Dynamic Compute?
By Chris Knowles
published: Friday, June 26 2009


This has been one heck of a sad week for the entertainment industry. Ed McMahon, Farrah Fawcett and Michael Jackson all passed away this week. Whenever significant cultural events like this occur, there is an explosion in communication as people try to find out what happened and discuss it with their peers. In the past, this would have been limited to talking with your neighbors, family, and friends, either in person or over a traditional POTS line. Fast forward to the 21st century and we now have real-time, bidirectional communication between virtually anyone, anywhere in the world.

 

When you have an unpredictable event like the death of a societal icon, or the launch of a new service that has the potential for extremely rapid adoption (or at the very least high traffic due to curiosity alone), it is very difficult, if not practically impossible, to anticipate the real-world resources needed to support the inbound demand. This is very clearly shown by the chart from Keynote Systems illustrating the availability and performance impact of this event on news websites.

 

[Figure: Keynote Systems chart of news website availability and performance]

Image from: http://www.datacenterknowledge.com/archives/2009/06/25/michael-jackson-news-slows-web-sites/

 

TMZ.com was the first news outlet to break the story of Michael Jackson's death, and consequently their site collapsed under the unexpected workload. It's hard to fault the IT team responsible for TMZ's service delivery. After all, no one knew MJ was going to pass away yesterday.

 

So where am I going with all of this? To the clouds, of course! If there was ever a real-world example of where a cloud solution would have played nicely into the delivery of a service hit by transient, high-intensity workloads that come without warning, this is it. Even a properly architected high-volume application or service that is designed to handle large increases in transient load has a finite capacity. Now, what if TMZ.com had the ability to automatically spin up cloud resources and shunt the new traffic load over to the cloud during the media frenzy? It could have meant full performance and availability during the peak of traffic, with service quality as good as their normal service levels (for the shunting, I'm a big fan of F5 gear for application delivery networking).

Now, they could have done this manually, I suppose. When they saw the traffic coming they could have provisioned some AWS instances, gotten their site and content up and running, and started routing traffic through a change to their load balancers. That'll work, but it's also manual and it's going to take them time to get it all implemented. In fact, by the time they're set up, their end users may have already hit a dead site and gone to one of their competitors. So what to do? Automate!
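To put some rough numbers behind the "finite capacity" point, here is a minimal sketch of the overflow arithmetic. It is not how any particular load balancer works; the function names, the requests-per-second figures, and the per-instance throughput are all hypothetical values chosen for illustration.

```python
import math
from typing import Tuple


def split_traffic(demand_rps: float, local_capacity_rps: float) -> Tuple[float, float]:
    """Return (local_rps, cloud_rps).

    The private infrastructure serves traffic up to its capacity;
    anything above that is the overflow that would be shunted to the cloud.
    """
    local = min(demand_rps, local_capacity_rps)
    cloud = max(0.0, demand_rps - local_capacity_rps)
    return local, cloud


def cloud_instances_needed(cloud_rps: float, per_instance_rps: float) -> int:
    """Number of cloud instances needed to absorb the overflow (rounded up)."""
    return math.ceil(cloud_rps / per_instance_rps)


if __name__ == "__main__":
    # Hypothetical spike: 5000 req/s against a site sized for 2000 req/s,
    # with each cloud instance assumed to handle 400 req/s.
    local, cloud = split_traffic(5000, 2000)
    print(local, cloud, cloud_instances_needed(cloud, 400))
```

On a normal day the overflow is zero and no cloud instances are needed; the cloud only costs money while the spike lasts.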

 

Sounds easy, but we all know IT automation is complex, costly, and out of scope for most SMBs and some enterprises. Or is it? This entire scenario could have been automated with readily available and cost-effective solutions already on the market. In this case, up.time 5 (our systems management solution) has a full bi-directional integration with VMware Orchestrator. If you are a VMware shop, you get Orchestrator for free with vCenter Server. If you are not familiar with Orchestrator, it is essentially a policy-based workflow automation tool that you can use to build automated scenarios to perform, well, pretty much anything. Orchestrator has the concept of plug-ins, which give it the know-how to interact directly with specific vendor technologies. up.time is the first systems management solution to deeply integrate with Orchestrator and provide this type of functionality.

 

So how does this play into the TMZ.com cloud scenario? Well, it goes something like this:

  1. End-user experience for the website is monitored at the logical service address (for example, www.mynewssite.com) using the HTTP service or WATM monitor.
  2. When the end-user experience begins to suffer, or servers start to indicate they are becoming overloaded, a workflow is automatically triggered to head off any end-user incidents that insufficient resources would otherwise cause.
  3. With the automation set up, the cloud is temporarily used to handle all excess load while the website is running hot. The automation in this case would include not only the additional virtual resources, but also the monitoring of the newly spun up capacity and applications. We're now sending traffic to our AWS cloud without anyone ever having had to do anything other than the initial Orchestrator configuration.
  4. It gets even better. So, what happens as the traffic slowly decreases back to normal levels? The monitoring of the newly created resources notices that the cloud is no longer needed and the private infrastructure can now handle the load. It triggers automated workflows to decommission those cloud resources, saving money by only using the cloud when needed and avoiding any sprawl issues.
  5. Lastly, the IT manager and system administrators have been receiving emails and alerts to let them know that these automated actions were happening, so they can sit back and watch as their IT infrastructure evolves to handle whatever traffic comes their way.
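For the curious, the decision logic behind steps 2 through 5 can be sketched in a few lines. This is only an illustration under stated assumptions, not the up.time or Orchestrator API: the `BurstController` class, its thresholds, and the "three consecutive samples" rule are all hypothetical, and a real workflow would provision and decommission actual instances where this sketch just flips a flag and records a notification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BurstController:
    """Hypothetical scale-out/scale-in decision loop with hysteresis."""
    scale_out_ms: float = 2000.0   # assumed "site is suffering" response time
    scale_in_ms: float = 500.0     # assumed "back to normal" response time
    breaches_required: int = 3     # consecutive samples needed before acting
    cloud_active: bool = False
    streak: int = 0
    notifications: List[str] = field(default_factory=list)

    def observe(self, response_ms: float) -> str:
        """Process one monitoring sample; return 'scale_out', 'scale_in' or 'none'."""
        if self.cloud_active:
            # While bursting, look for sustained fast responses.
            breached = response_ms < self.scale_in_ms
        else:
            # Otherwise, look for sustained slow responses.
            breached = response_ms > self.scale_out_ms
        self.streak = self.streak + 1 if breached else 0
        if self.streak < self.breaches_required:
            return "none"
        # Threshold held long enough: toggle cloud usage and notify admins.
        self.streak = 0
        self.cloud_active = not self.cloud_active
        action = "scale_out" if self.cloud_active else "scale_in"
        self.notifications.append(f"{action} triggered at {response_ms:.0f} ms")
        return action


if __name__ == "__main__":
    controller = BurstController()
    for sample in [300, 2500, 2600, 2700, 1800, 400, 300, 200]:
        print(sample, controller.observe(sample))
    print(controller.notifications)
```

The two separate thresholds and the consecutive-sample rule are there so that a single slow request does not spin up cloud capacity, and a single fast one does not tear it down again.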

 

Pretty cool stuff. Hey, did we just make Cloud move from "buzzword" to real business value!? I think so. So, with a little up-front configuration you can implement "Automated Incident Avoidance" to keep your services running when they are faced with unforeseen transient workloads. And the best part is, this is only one example out of literally hundreds (dare I say thousands) of ways you can automate your infrastructure management to ensure you are operating at the highest possible levels of efficiency, from both a technology and a resource standpoint.

 

 


 

 

Chris Knowles is a Solutions Architect at uptime software, the makers of up.time.

 

 

 

 
