The Impractical Reality of Deploying Virtualization in Real World Environments
By Nicholas Filippi
published: Friday, September 25 2009


 

Overview

 

Virtualization is nearly at the top of every large enterprise's list of key IT initiatives. Whether to increase server utilization, promote green data center practices or for countless other reasons, upper management in most corporate environments continues to push the requirement for virtualization down through the organization as well as out to individual solution vendors. In fact, the demand is so high that for any application not targeted for migration to a virtual environment, administrators and vendors alike must provide formal justification for why this is the case.

 

For the business-critical messaging infrastructure, despite the continued growth in adoption of virtualization within large enterprises, several challenges still inhibit full migration to a virtualized environment. As new management tools, processes and procedures, and product functionality develop, the practicality of such a migration is likely to change, but at present it still suffers from operational and technical limitations.

 

In working with several enterprise clients over the years to explore virtualization as a potential deployment option, I have seen much of the messaging infrastructure deployed successfully in a virtual environment, but I see the true exponential growth of this model as something yet to be realized.

 

Operational Challenges

 

Maintaining High Service-Level-Agreements (SLAs)

Business-critical applications such as email are strictly measured against their service level agreements (SLAs), which set extremely high standards for reliability, availability and recovery. Previously, much of the ownership and responsibility for these requirements was placed on the appliance and application vendors, who had to prove both fault tolerance and disaster recovery within their solutions. Introducing virtualization as a new platform for these applications raises complexity, both in the increased number of disaster scenarios and points of failure and in the operations and coordination required if and when such a failure occurs. The truth in many organizations is that this workflow, process and failure analysis has yet to be conducted. Until it is fully defined, implemented and matured within enterprises, the ability to achieve such high SLAs for business-critical applications in a virtual environment remains unproven. As a result, most business-critical applications are not among the initial targets for deployment in a virtual environment.

 

Recoverability & Problem Management

The deployment, configuration and management of applications in a virtualized environment cross several domains and functional organizations within a large enterprise. In addition to the application owners themselves, this deployment model requires close interaction between the virtualization team, the enterprise storage team and, in some cases, the security team. While the adoption of virtualization has forced these previously independent groups to work more collaboratively, there remains considerable skepticism about how efficient troubleshooting, diagnosing and fixing failures can be. In any crisis scenario, where the first objective is to isolate variables, it is often a struggle when those variables cross technical, functional and political boundaries within the organization.

 

Change Control Management

One of the largest fears in deploying a virtualized environment is the uncontrolled proliferation of virtual machines being brought online, commonly referred to as "VM sprawl." Because virtual machines can be brought up with relative ease, many administrators are concerned that more machines will be added than can actually be managed, controlled and secured. To prevent this, many organizations have implemented strict change control processes and procedures, replicating the existing data center change control for physical machines. So while many of the marketed benefits of virtualization center on the ease with which machines can be initialized and administered (in some cases without the need for additional rack space, power and cooling), this benefit goes unrealized in many large enterprises because of the process overhead put in place. Without the promised benefits of easier administration, application owners have little motivation to perform such a migration.

 

Enterprise Deployment Cost-Structure and Charge Back Models

The traditional cost structure and charge back model within large enterprise data centers has, in most cases, simply been a function of the number of physical servers, rack space, power and cooling that a given solution required. Even with more complicated solutions that required integration with storage area networks (SANs) and enterprise backup systems, the model remained relatively simple and well known. With virtualization, machines can no longer be measured by physical servers, space, power and cooling requirements, and the overhead of the virtualization and storage administration teams now has to be taken into account, so the cost and charge back models must be re-analyzed and rebuilt. While smaller enterprises have dealt with this by deploying virtualization in silos based on applications, departments and other logical groupings, this is regarded as a model that does not scale.
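
To make the contrast concrete, here is a minimal sketch of the two charge back models described above. Every rate, name and field in it is a hypothetical placeholder, and the proportional split of shared costs is only one possible allocation rule, assumed here for illustration rather than drawn from any particular enterprise.

```python
from dataclasses import dataclass

# All rates are hypothetical placeholders chosen for illustration only.
RATE_PER_SERVER = 400.0     # monthly cost per physical server
RATE_PER_RACK_UNIT = 25.0   # monthly cost per rack unit of space
RATE_PER_KW = 120.0         # monthly cost per kW of power and cooling


def physical_chargeback(servers: int, rack_units: int, kilowatts: float) -> float:
    """Traditional model: a simple function of servers, space, power and cooling."""
    return (servers * RATE_PER_SERVER
            + rack_units * RATE_PER_RACK_UNIT
            + kilowatts * RATE_PER_KW)


@dataclass
class VirtualMachine:
    owner: str       # department or application being charged
    vcpus: int       # virtual CPUs reserved
    memory_gb: int   # memory reserved, in GB


def virtual_chargeback(vms, host_cost: float, admin_overhead: float) -> dict:
    """Assumed virtualized model: pool the shared host cost with the overhead of
    the virtualization and storage teams, then split the pool across owners in
    proportion to reserved vCPUs and memory (weighted equally)."""
    total_vcpus = sum(vm.vcpus for vm in vms)
    total_mem = sum(vm.memory_gb for vm in vms)
    pool = host_cost + admin_overhead
    charges = {}
    for vm in vms:
        share = 0.5 * vm.vcpus / total_vcpus + 0.5 * vm.memory_gb / total_mem
        charges[vm.owner] = charges.get(vm.owner, 0.0) + share * pool
    return charges
```

Even in this simplified form, the virtualized model needs inputs the physical model never did: an agreed weighting between resources and a figure for team overhead. Both must be negotiated across organizations before any charge can be computed.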

 

Technical Challenges

 

Practicality of Live Migration for High Availability

One of the most significant marketed benefits of virtualization has been the ability to perform live migrations of virtual machines, significantly reducing (if not eliminating) downtime due to hardware failures or even scheduled maintenance windows. While this capability is unmatched in the physical server world, the technical requirements to enable this remain high. The two most common challenges are the requirements for a shared storage solution and direct compatibility between CPU instruction sets on all physical servers under management. 
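
The two requirements above can be expressed as a simple pre-migration check. The sketch below is illustrative only: the Host structure, its field names and the rule that the target must offer every CPU feature of the source are assumptions made for this example, not the API or exact compatibility logic of any particular virtualization product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    cpu_features: frozenset   # CPU feature flags, e.g. "sse4_2" (hypothetical input)
    datastores: frozenset     # storage volumes visible to this host


def can_live_migrate(source: Host, target: Host, vm_datastore: str) -> list:
    """Return a list of blocking reasons; an empty list means none were found."""
    problems = []

    # Requirement 1: the VM's disks must sit on storage that both hosts can see.
    if vm_datastore not in source.datastores or vm_datastore not in target.datastores:
        problems.append(f"datastore {vm_datastore!r} is not visible to both hosts")

    # Requirement 2 (conservative assumption): the target must offer every CPU
    # feature of the source, since the running guest may be using any of them.
    missing = source.cpu_features - target.cpu_features
    if missing:
        problems.append("target lacks CPU features: " + ", ".join(sorted(missing)))

    return problems
```

Note that the CPU check is directional in this sketch: moving a machine from an older host to a newer one may pass while the reverse move fails, which limits how freely machines can be shuffled across a mixed hardware pool.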

 

The first challenge is the potential performance impact of a virtual machine drawing on system resources from different physical servers, a significant change from the traditional operating model in which applications rely mostly on local resources for all functions. One requirement for live migration is that the virtual machine's logical hard drives physically reside on a shared storage device visible to all physical hosts, most typically an enterprise SAN. This ensures that migrating a machine between hosts transfers only its memory contents and CPU state, avoiding the time-consuming low-level drive copying that is all too familiar to most administrators. As one might expect when a machine's hard drives are not on the same physical system on which the machine is running, there is concern over the I/O performance of such a machine, in both best-case and worst-case throughput, particularly where multiple machines may be reading from and writing to the same shared storage over a saturated network.
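
A rough back-of-the-envelope calculation shows both sides of this trade-off. The sketch below is idealized and rests on stated assumptions: decimal units (1 GB = 8 gigabits), no protocol overhead, no re-copying of memory that changes during migration, and an even split of a saturated link among active machines. The function names and example sizes are hypothetical.

```python
def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Idealized time to move size_gb gigabytes over a link of link_gbps gigabits/s."""
    return size_gb * 8 / link_gbps


def fair_share_mbps(link_gbps: float, active_vms: int) -> float:
    """Per-machine throughput in megabits/s if active_vms evenly share a saturated link."""
    return link_gbps * 1000 / active_vms
```

Under these assumptions, moving an 8 GB memory image over a 1 Gb/s link takes about 64 seconds, while copying a 200 GB disk over the same link would take about 1,600 seconds, which is why shared storage is required. The cost appears on the other side: 20 machines contending for that same 1 Gb/s path to shared storage would each see roughly 50 Mb/s in the worst case.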