Virtualization's Impact on IT Operations - Part Two
By Kevin Lees
published: Tuesday, July 01 2008



In the first article of this series, I explored some advantages virtualization provides to IT Operations. Those advantages were seen through the eyes of someone responsible for IT system operations in a pre-virtualization environment at an ASP, namely me, and focused on how these advantages would positively impact Service Level Management. In this article and the next, which is the final one in the series, I'll look at some of the challenges virtualization presents to IT Operations in its pursuit of relieving the seemingly "Atlas Carrying the Weight of the World on His Shoulders"-like pressure of Service Level Management. The present article addresses what I see as some of the underlying Service Level Management challenges presented by virtualization, as well as the tools available to begin addressing them. The last article in the series will look at additional virtualization challenges facing IT Operations from an ITIL perspective, and therefore from a primarily process-oriented view.

 

To begin, what are the overarching challenges virtualization thrusts upon IT Operations and Service Level Management? First, there is the added complexity of virtualization. Now, in addition to physical servers, IT Operations must contend with multiple virtual machines per physical server and their required resources, not to mention keeping track of resource pool constraints across multiple physical servers. And the server infrastructure is not the only thing affected: what about storage virtualization (not to mention the attention I/O virtualization will require in the near future)? Second, there is the increasingly dynamic nature of virtualization. Between the ability to manually move "live" virtual machines between physical servers and having them automatically moved in response to resource needs or physical server problems, how can you monitor and manage something if you don't even know where it physically resides? Additionally, how does the virtual infrastructure relate to and impact the rest of your IT infrastructure? Lastly, there is the challenge created by the ease with which virtual machines can be generated and deployed. You want to be responsive to changing business needs; you may even want to put control of virtual machine deployment in the hands of the department whose business function it specifically provides (don't panic; smelling salts help!), but what havoc might that wreak on your infrastructure? How can IT Operations begin to address these challenges?

 

Let's start by defining four key aspects of Service Level Management: proactively planning to prevent problems before they occur; quickly and efficiently provisioning systems into service in response to changing business needs; monitoring the infrastructure to detect problems before the users do; and troubleshooting infrastructure performance issues to minimize unplanned downtime. While each of these exists when dealing with a purely physical server environment, they are compounded by virtualization. Let's address the "why" as well as the tools available to begin addressing each, in turn. Before I begin, though, I'd like to provide a disclaimer. While I try to be virtualization vendor agnostic, many of the tools I'll identify in this article provide solutions for a VMware-based environment only. Some do address other vendors' solutions (Citrix's XenServer and Microsoft Virtual Server specifically), and those that currently don't say they will support other virtualization vendors' technologies as the market need arises. That said, I'll identify support beyond VMware where appropriate.

 

When I refer to proactive planning, I'm not talking about the massive planning efforts involved in consolidating an entire datacenter. Don't get me wrong, these large scale planning efforts are obviously important. And, fortunately, there are excellent tools becoming available on the market to plan an entire datacenter consolidation. One example is Cirba's Data Center Intelligence software, which does a detailed analysis of the entire environment, taking into account not only workload constraints but technical and business constraints as well. But the proactive planning I have in mind is the day-in, day-out, project-in, project-out server virtualization planning facing IT Operations: the type of planning (aka capacity planning) needed to answer the question of which ESX or Xen server should host that new business application virtual machine while still maintaining needed resource reserves. It is the kind of planning that, if done consistently, prevents resource problems and unplanned downtime from occurring at the least opportune time (not that there is EVER an opportune time).

 

Wouldn't it be nice if you could perform "what if" scenarios to "test" virtualization infrastructure changes, optimizing a virtual machine's placement and predicting the best use of available resources prior to implementation? Well, you don't have to wait any longer, as tools like Profiler from Tek-Tools Software, Inc., HP's Insight Dynamics - VSE, Akorri's BalancePoint Performance Dynamics Modeling™, and BMC's Performance Assurance solution provide the planning assistance needed to understand the impact a virtual machine will have on resource capacity. Using real-time and historical data, these tools help determine the best placement for your new virtual machine workloads. But be careful; while these tools address workload impact on the resources of a single physical server hosting virtual machines, they do not yet take into account cross-server resource pool constraints.
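
To make the "what if" idea concrete, here is a minimal Python sketch of my own. It is not how any of the products named above work; the `Host`, `VM`, `what_if` and `best_placement` names, the 20% reserve, and the sample numbers are all assumptions for illustration. Like the tools described, it looks at one physical server at a time and ignores cross-server resource pool constraints.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_mhz: int        # total CPU capacity of the physical server
    mem_mb: int         # total memory capacity of the physical server
    cpu_used: int = 0   # CPU already committed to existing virtual machines
    mem_used: int = 0   # memory already committed to existing virtual machines


@dataclass
class VM:
    name: str
    cpu_mhz: int        # expected CPU demand of the new virtual machine
    mem_mb: int         # expected memory demand of the new virtual machine


def what_if(host, vm, reserve=0.20):
    """Project (cpu, mem) utilization if vm were placed on host.

    Returns None when the placement would eat into the resource reserve
    (hypothetical default: keep 20% of each resource free).
    """
    cpu = (host.cpu_used + vm.cpu_mhz) / host.cpu_mhz
    mem = (host.mem_used + vm.mem_mb) / host.mem_mb
    limit = 1.0 - reserve
    if cpu > limit or mem > limit:
        return None
    return cpu, mem


def best_placement(hosts, vm, reserve=0.20):
    """Pick the host whose busiest resource stays lowest after placement."""
    candidates = []
    for host in hosts:
        projected = what_if(host, vm, reserve)
        if projected is not None:
            candidates.append((max(projected), host.name))
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    hosts = [
        Host("esx01", cpu_mhz=10000, mem_mb=32768, cpu_used=7000, mem_used=16384),
        Host("esx02", cpu_mhz=10000, mem_mb=32768, cpu_used=3000, mem_used=20000),
    ]
    print(best_placement(hosts, VM("newapp", cpu_mhz=2000, mem_mb=4096)))
```

A real planning tool would feed this kind of calculation with measured real-time and historical data rather than static estimates; the sketch only shows the shape of the question being asked.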

 

As to the second key aspect (in my opinion, at least) of Service Level Management - quickly and efficiently provisioning systems into service in response to changing business needs - virtualization adds the challenge of being able to provision virtual machines perhaps a bit too quickly and easily. Once you have established templates, or by making use of virtual machine cloning, vendor tools like VirtualCenter and XenCenter make provisioning so easy that controlling it becomes a concern, to say the least. If the decision has been made to let a business unit provision its own virtual machines, many believe it may become IT Operations' bête noire.

 

I believe the resulting concern, that unrestrained virtual machine provisioning will lead to the slow death of IT Operations through virtual sprawl, should be addressed via process (Change Management specifically, which I'll come back to in the third article in this series). Vendors recognize the importance of wrapping a process around provisioning, and tools are making their way to the market to address this. At least three tools are available to control the provisioning process: VMware's LifeCycle Manager, BMC's Virtualization Manager, and Opalis' Integration Server. LifeCycle Manager and Virtualization Manager are pretty much self-contained in that all of the required tools, along with a workflow engine, are contained in the product. If you prefer to integrate with third-party or existing tools, Opalis' Integration Server provides a middleware approach of sorts which allows you to integrate third-party products with Opalis' workflow engine to create an automated provisioning process. Most importantly, each incorporates an approval step prior to a virtual machine being provisioned. With these tools you could, theoretically at least, allow a business unit to manage the lifecycle of its own virtual machines, within the resource constraints defined by IT Operations, without the fear of virtual sprawl running amok in the datacenter.
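
The approval step is the heart of the matter, so here is a rough Python sketch of the idea. It is purely illustrative and does not reflect the workflow engine or API of any product named above; the `Quota`, `Request` and `ProvisioningWorkflow` names, the states, and the quota fields are all hypothetical. The point is simply that nothing gets provisioned until it has been approved against constraints IT Operations defined.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    max_vms: int       # most virtual machines a business unit may hold
    max_mem_mb: int    # most memory a business unit may consume in total


@dataclass
class Request:
    business_unit: str
    vm_name: str
    mem_mb: int
    state: str = "pending"   # pending -> approved or rejected -> provisioned


class ProvisioningWorkflow:
    """Hypothetical approval gate in front of virtual machine provisioning."""

    def __init__(self, quotas):
        self.quotas = quotas     # business unit name -> Quota
        self.accepted = []       # requests approved or already provisioned

    def _usage(self, unit):
        mine = [r for r in self.accepted if r.business_unit == unit]
        return len(mine), sum(r.mem_mb for r in mine)

    def review(self, request):
        """Approve only if the unit stays within its quota; unknown units are rejected."""
        quota = self.quotas.get(request.business_unit)
        count, mem = self._usage(request.business_unit)
        ok = (
            quota is not None
            and count + 1 <= quota.max_vms
            and mem + request.mem_mb <= quota.max_mem_mb
        )
        request.state = "approved" if ok else "rejected"
        if ok:
            self.accepted.append(request)
        return ok

    def provision(self, request):
        """Refuse to provision anything that has not passed the approval step."""
        if request.state != "approved":
            raise PermissionError(f"{request.vm_name} has not been approved")
        request.state = "provisioned"   # a real tool would clone a template here


if __name__ == "__main__":
    wf = ProvisioningWorkflow({"marketing": Quota(max_vms=2, max_mem_mb=8192)})
    req = Request("marketing", "web01", mem_mb=4096)
    if wf.review(req):
        wf.provision(req)
    print(req.state)
```

In this sketch the approval is automatic and quota-based; in practice the approval step could just as well be a human sign-off inside a Change Management process.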

 

OK, I've attempted to identify tools to address the Service Level Management challenges of planning and provisioning. What about monitoring the infrastructure to detect problems before they affect the users? There are certainly adequate tools, typically those provided by the vendors, to monitor the virtual infrastructure, but what about the challenge of integrated monitoring of the entire infrastructure, of which the virtual infrastructure is but a component? At a bare minimum, such an integrated monitoring tool needs to provide event correlation, but what about the added challenge presented by the dynamic nature of virtual machine movement? Knowing an event has occurred is valuable, but how about subsequent troubleshooting when the location of the virtual machine generating the event is fluid? Tracking virtual machines that can be easily moved manually is one thing, but add to that those virtual machines that are dynamically moved, as with VMware's DRS product, and troubleshooting an event could become even more time consuming.

 

For integrated, "single-pane-of-glass" event monitoring / correlation of the physical and virtual infrastructures, vendors are again rising to the challenge. All of the "big name" vendors have enhanced their operations management solutions to include the virtualization infrastructure. These include HP's Operations Manager; IBM with Tivoli Monitoring for Virtual Servers; BMC Software with Performance Manager for Virtual Servers; Computer Associates' Unicenter; and Microsoft's System Center Operations Manager. All support VMware-based environments. In addition, BMC supports Citrix's and Microsoft's virtualization offerings. Aside from the big-name operations management vendors, other vendors with integrated event management of the physical and virtual infrastructures include Tek-Tools with their Profiler product and Avocent with DSView3; for a good, lower-cost solution, Woodstone's Servers Alive has added VMware support. To address the fluidity of virtual machine placement, HP and Microsoft can dynamically track virtual machine locations via nWare's SPI for VMware.
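
To illustrate why location tracking matters for event correlation, here is a small Python sketch of my own. It is not based on any vendor's product, and the `LocationTracker` class, its method names, and the integer timestamps are assumptions. It keeps a time-ordered history of each virtual machine's moves so that an event can be tied to the physical server the virtual machine was on when the event fired, rather than to where it happens to be now.

```python
import bisect
from collections import defaultdict


class LocationTracker:
    """Hypothetical history of which physical server hosted each virtual machine."""

    def __init__(self):
        self._times = defaultdict(list)   # vm -> sorted move timestamps
        self._hosts = defaultdict(list)   # vm -> host names, parallel to _times

    def record_move(self, vm, timestamp, host):
        """Record that vm ran on host from timestamp onward (manual or automatic move)."""
        i = bisect.bisect_right(self._times[vm], timestamp)
        self._times[vm].insert(i, timestamp)
        self._hosts[vm].insert(i, host)

    def host_at(self, vm, timestamp):
        """Return the host vm was on at timestamp, or None if nothing is known."""
        times = self._times.get(vm, [])
        i = bisect.bisect_right(times, timestamp) - 1
        return self._hosts[vm][i] if i >= 0 else None


if __name__ == "__main__":
    tracker = LocationTracker()
    tracker.record_move("mail01", 100, "esx01")
    tracker.record_move("mail01", 200, "esx02")   # e.g. moved automatically
    # An alert raised at time 150 belongs to esx01, not the current host.
    print(tracker.host_at("mail01", 150))
```

With a history like this, an event raised by a virtual machine can be correlated with physical server events from the same moment, which is the part that gets time consuming to do by hand.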

 

The final key aspect of Service Level Management I mentioned was troubleshooting performance issues in an infrastructure to minimize user dissatisfaction or, in a worst case scenario, unplanned downtime. This obviously applies to a purely physical environment, but is complicated in a virtualized environment by the existence of multiple virtual machines on a physical server, contending for shared storage, memory, CPU and network resources. Multiply this by having a service supported by multiple virtual machines on multiple physical servers sharing, perhaps, virtualized storage and you begin to see the performance troubleshooting challenge.

 

In my experience, performance troubleshooting, like programming, is an art. While it can be codified to an extent, those people who are really good at performance troubleshooting are not only very experienced, but seem to have some innate ability that the rest of us mere mortals can only wish for. Wouldn't it be nice to be able to easily pinpoint performance bottlenecks in an end-to-end virtualized environment, i.e., virtual machine to physical server to storage, so the rest of us could experience what it feels like in that rarefied air? Welcome to the world of cross domain analysis. In this context, cross domain analysis refers to applying advanced analytics to bridge the gap between the performance and availability of virtual, and associated physical, infrastructure components, and resource capacity consumption.

 

Employing such a capability means that the tool can determine where a performance bottleneck, or resource contention, is occurring in the virtual machine / physical server / storage path of the infrastructure, providing for quicker problem identification and resolution. Vendors providing cross domain analysis tools include Akorri with BalancePoint Cross-Domain Analysis™, Tek-Tools Software with their Profiler Software Suite, and BMC with ProactiveNet Analytics, which is part of their Services Assurance family of solutions. This would seem to be a "must-have" capability in any large scale, virtualized production environment.
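
As a very rough illustration of the idea, and emphatically not the analytics used by any of the products above, the following Python sketch ranks resources along the virtual machine / physical server / storage path by how closely each one's measurements track a service's response time. The function names, metric names and sample values are all made up for the example, and plain correlation is only a stand-in for the far more advanced analysis real tools apply.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def likely_bottleneck(response_ms, metrics):
    """Return (name, scores): the metric that best tracks response time.

    metrics maps a resource name to a series sampled at the same
    intervals as response_ms.
    """
    scores = {name: pearson(series, response_ms) for name, series in metrics.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    response_ms = [10, 12, 30, 55, 20]           # service response time samples
    metrics = {
        "vm_cpu_pct": [40, 42, 41, 43, 40],      # virtual machine layer
        "host_mem_pct": [60, 60, 61, 60, 60],    # physical server layer
        "storage_latency_ms": [5, 6, 15, 28, 10] # storage layer
    }
    name, _ = likely_bottleneck(response_ms, metrics)
    print(name)
```

Correlation only suggests where to look first; it does not prove cause, which is one reason experienced troubleshooters remain valuable even with such tools in hand.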

 

In closing, just as the placement of virtual machines on physical servers can be fluid, so is the Service Level Management tool market. What I've attempted to provide is merely a snapshot in time of currently available tools. Rest assured that, given the speed with which virtualization is becoming an IT mainstay, planning and management tools will be brought to market at an increasing (if not exponential) rate. So far, the market seems to be dominated by the traditionally big players (HP, IBM, CA, BMC), but the newer players like Tek-Tools Software and Akorri portend what I believe will be a plethora of solid, focused management tool vendors.

 


Related Links: 

PMP'n Part One, PMP'n Part Two, Impact on IT Operations - Part One

 

 

 

Kevin is the principal consultant at Premier Project Management, LLC, where he specializes in IT infrastructure architecture development as well as planning and managing IT infrastructure, virtualization, and data center consolidation/relocation projects. He is a Project Management Professional and VMware Certified Professional with 26+ years of technical, management, and consulting experience in systems integration, project management and IT operations. Recent engagements include performing an enterprise-wide IT Infrastructure & Operations assessment as well as planning and managing a multi-datacenter consolidation / relocation, using virtualization, in the publishing industry; managing the implementation and operational "go-live" of two e-mail platforms for an international e-mail ASP; and providing technical project management and virtualization services for the assessment phase of a multi-datacenter consolidation / relocation project in the on-line, e-mail marketing service provider space. Kevin can be reached at kevin.lees@premier-pm.com.

 
 

 
