What You Need to Know about Virtualization and Availability
By Jerry Melnick
published: Friday, November 02 2007





Virtualization brings many IT benefits, among them better server consolidation and utilization, lower capital and operating expenses, and greater flexibility to meet business needs. But it also brings some unintended consequences, one of which is a dramatically greater need for rock-solid availability. Because server consolidation can make one physical server the single point of failure for multiple applications, the implications of downtime are much greater.

Today’s virtualization technologies are particularly useful for protecting applications from planned downtime—outages necessary for administrative purposes. Using live migration technologies, companies can move virtual machines and their running applications between physical machines without disruption. Examples of planned migration technologies include VMware VMotion and Citrix XenMotion.

But protecting virtual environments from unplanned downtime is a different matter. In many cases, virtual environments employ traditional clustering and failover techniques, which use rudimentary heartbeat pings to check the status of a virtual machine. This approach suffers from several drawbacks:
  • Clustering and failover add cost and complexity to the environment, requiring manual configuration, setup, scripting and testing to define the appropriate actions to take in case of failures. This additional administrative complexity can introduce errors, contributing to availability issues.
  • Heartbeat pings are unable to reliably determine the health of a virtual machine and may not distinguish between I/O path failures, server failures, and a lack of system resources. In some cases, these limitations may result in unnecessary or false failovers. In other cases, discrete storage or network device outages are not identified as failures and the system does not fail over.
  • The failover process is far from certain; it assumes that the administrator has configured the standby system appropriately for the application and has maintained that configuration. If the target system is not configured appropriately, then when a failover does occur, the application or virtual machine is inoperable on the standby system, causing a "failed failover." Given the sense of uncertainty, some refer to this approach as "ping and pray."
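The heartbeat limitation described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `VmState` fields, the `heartbeat_decision` and `classify` functions, and the two scenarios are invented for illustration and do not describe any particular clustering product. The point is only that a decision based on a single ping reply cannot see what kind of fault occurred.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hypothetical snapshot of a virtual machine's health (illustrative only)."""
    responds_to_ping: bool
    storage_path_ok: bool
    network_path_ok: bool


def heartbeat_decision(state: VmState) -> str:
    """Naive heartbeat check: the only signal is whether the ping came back."""
    return "stay" if state.responds_to_ping else "failover"


def classify(state: VmState) -> str:
    """What actually happened, using information a bare ping does not carry."""
    if not state.storage_path_ok:
        return "io_path_failure"
    if not state.network_path_ok:
        return "network_failure"
    if not state.responds_to_ping:
        # Could be a server failure or resource starvation; a ping cannot tell which.
        return "unresponsive"
    return "healthy"


# Assumed scenario 1: the VM is starved of resources and misses a ping.
starved = VmState(responds_to_ping=False, storage_path_ok=True, network_path_ok=True)
# Assumed scenario 2: the storage path is down, but the VM still answers pings.
storage_outage = VmState(responds_to_ping=True, storage_path_ok=False, network_path_ok=True)

for name, state in (("starved", starved), ("storage_outage", storage_outage)):
    print(name, heartbeat_decision(state), classify(state))
```

In the first scenario the heartbeat logic triggers a failover that may be unnecessary; in the second it reports nothing wrong even though the application has lost its storage.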

Continuous Virtualization
So how do you protect against failures without the interruptions and hassles of failovers? With fault tolerant-class availability. What is fault tolerant-class, or FT-class, availability? With FT-class availability, instead of failing over every time a failure occurs, virtual workloads are protected by redundant virtual machines, so the workloads ride out a whole category of common system and network failures. No loss of data. No failover or restart.
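As a rough illustration of the idea of redundant virtual machines, the toy sketch below sends every operation to two replicas so that a survivor always holds current state. The `Replica` and `RedundantPair` classes and the names `vm-a` and `vm-b` are assumptions made up for this example; real FT-class software must also keep the replicas synchronized at a much lower level, which this sketch does not attempt to model.

```python
class Replica:
    """Hypothetical stand-in for one virtual machine holding in-memory state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.counter = 0

    def apply(self, amount: int) -> int:
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.counter += amount
        return self.counter


class RedundantPair:
    """Sends every operation to all live replicas, so a survivor has current state."""

    def __init__(self) -> None:
        self.replicas = [Replica("vm-a"), Replica("vm-b")]

    def apply(self, amount: int) -> int:
        results = [r.apply(amount) for r in self.replicas if r.alive]
        if not results:
            raise RuntimeError("all replicas are down")
        return results[0]


pair = RedundantPair()
pair.apply(5)
pair.replicas[0].alive = False  # simulate the failure of one host
total = pair.apply(3)           # the surviving replica answers with its state intact
print(total)
```

Because both replicas processed the first operation, the surviving one continues from the same state after the simulated failure, with no restart and no lost update.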

Think fault tolerant-class availability is too expensive or too complex? In early 2008, a new category of software-only FT-class availability that sits on top of the hypervisor and runs on standard x86 servers will become available. Organizations will be able to achieve fault tolerant virtual machines without expensive hardware or costly modifications to their applications.

In addition to providing a breakthrough in reliability, this FT-class availability software will be characterized by next-generation automation. It will completely automate setup, configuration, fault detection and policy management. Automated setup and configuration will eliminate the manual configuration that today’s availability solutions for virtual machines require. Point to the VM you want to protect, and the software will do the rest, fully protecting the VM without user intervention. And multiple virtual machines will be treated as one, so the administrator will only have to manage one environment for each application, not two separate environments as is the case with today’s availability offerings. Embedded policy management in this new class of software will automatically handle all system, network and disk I/O failures.

What FT-Class Virtual Machines Mean for Business
Fault tolerant-class virtual machines will allow companies to gain the benefits of server consolidation without the server being the single point of failure for multiple applications. FT-class availability will bring the benefits of virtualization to business-critical applications that companies were previously afraid to virtualize because of reliability concerns. And because this new approach will reduce the cost and complexity of protecting virtual machines, applications that aren’t protected today (but should be) will now be protected. The end result: server virtualization will be adopted across a much broader class of applications and a much broader range of organizations.

Jerry Melnick is Chief Technology Officer at Marathon Technologies Corporation, the leading provider of automated, fault tolerant-class availability solutions for virtual and physical environments.

Jerry is responsible for Marathon’s technology roadmap, which is driving the convergence of high availability and virtualization technologies. Under Jerry’s leadership, Marathon is developing the first-ever fault tolerant software solution for virtual environments. Built in partnership with XenSource, this cutting-edge product was awarded “Best New Technology” at VMworld 2007.

Before joining Marathon, he held executive positions at PPGx, Inc. and Belmont Research, as well as management and technical roles at Digital Equipment Corporation, where he was responsible for the development and deployment of mission-critical platforms to support enterprise computing environments. He led a variety of system and product development efforts in the areas of operating systems, network communications, database systems, and computing languages.

Jerry holds a BS in chemistry from Beloit College and has completed graduate work in computer engineering and computer science at Boston University.

For more information on Marathon visit www.marathontechnologies.com.