Server Virtualization Storage Based Performance "Gotcha"
By Marc Staimer
published: Monday, June 09 2008



Server virtualization has become an irresistible force sweeping into the world's data centers. With compelling cost and management savings from server consolidation, server virtualization's future seems secure. Or does it? Not everything is perfect in the world of virtualized servers.

It is not uncommon for system administrators to find stunning application performance degradation when moving from the physical world to the virtual one. Invariably, the drop-off in application performance shows up after the pilot has moved to production, and efforts to fix it cause significant frustration. Both the problem and the answer lie within the SAN storage.

There are four definitive bottlenecks that can and will crater virtualized server application performance if not managed correctly. They include:

  1. Oversubscription within the virtualized server.

  2. Oversubscription within the HDD and target storage system queues.

  3. Oversubscription within the SAN fabric.

  4. Oversubscription within the target storage ports.

All four revolve around the concept of oversubscription. Oversubscription means that the amount of potential bandwidth assigned to a given port or device is much greater than the bandwidth available. Oversubscription takes advantage of statistical probability. It is highly unlikely that all of the users or applications using that bandwidth will do so at exactly the same time. This allows for much higher utilization of the assets and significant cost savings from fewer idle assets. This makes huge economic sense and has been used everywhere for hundreds of years including hot bunking in naval vessels, traditional phone systems, and the Internet. It is a sound concept.

The downside of oversubscription is the risk that users and applications will concurrently attempt to use all of the assigned capacity resulting in much reduced performance. The risks are generally low, if there is not too much oversubscription. And that's the rub. The cumulative multiplying effect of each level of oversubscription dramatically increases the probability of that downside risk. A deeper examination of each of these oversubscription bottlenecks shows how.
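Before looking at each bottleneck, the statistical argument above can be made concrete with a minimal sketch. It assumes every workload is busy independently with the same probability, which is a simplification that the article does not state, and the function name and all numbers are hypothetical. It computes the chance that more workloads are busy at once than the shared resource was sized for.

```python
from math import comb


def overload_probability(n_workloads: int, p_active: float, capacity: int) -> float:
    """Chance that more than `capacity` workloads are busy at the same moment.

    Assumes every workload is busy independently with probability `p_active`
    (a simplifying assumption; real workloads are often correlated).
    """
    return sum(
        (
            comb(n_workloads, k)
            * p_active ** k
            * (1 - p_active) ** (n_workloads - k)
            for k in range(capacity + 1, n_workloads + 1)
        ),
        0.0,
    )


if __name__ == "__main__":
    # Hypothetical numbers: a resource sized for 4 concurrent workloads,
    # each busy 20% of the time, shared first by 8 and then by 32 workloads.
    for n in (8, 32):
        print(n, round(overload_probability(n, 0.2, 4), 4))
```

Comparing the two cases shows the point of the paragraph above: the same capacity that is rarely exceeded by 8 workloads is exceeded most of the time once 32 workloads are stacked behind it.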

Oversubscription within the virtualized server

Oversubscription at the server is how server virtualization works. Too much oversubscription occurs when there are too many guests and applications competing for those server resources. One factor that complicates just how many is too many is the resource intensity of each application.

A second factor is the hypervisor's storage virtualization layer. This is where the LUNs (SCSI logical unit numbers) assigned to the physical server are carved up by the hypervisor into virtual LUNs. The assigned target LUN in a traditional SAN storage system is tied to a specific number of drives in a RAID group (usually no more than 8). Whereas the physical world has unique LUNs for each server, the virtual server world has multiple virtual machines accessing the same LUN (meaning the same disks) at the same time. This is compounded by oversubscription at the queues.

Oversubscription within the HDD and target storage system

Each HDD has a limited queue depth that allows multiple commands to stack up before a busy signal is sent back to the storage system. The storage system itself also has a limited queue depth before it sends a busy signal back to the application. The queue depth per Fibre Channel or SAS drive is 256 to 512. The queue depth per SATA drive is at most 32 and more often than not 0 (a depth of 32 requires command queuing in the disk controller, which is atypical).

This means that LUNs drawn from SATA disk RAID groups are far more likely to have busy contention than LUNs drawn from RAID groups with SAS or Fibre Channel disks. Even then, there can be disk contention if there is a high number of IO or throughput intensive guests on the hypervisor.
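The queue arithmetic can be sketched roughly as follows. The queue depths of 32 and 256 come from the figures above; the guest count, the commands in flight per guest, and the function names are hypothetical, and the model simply assumes the worst case where every guest has all of its commands outstanding against the same device at once.

```python
def outstanding_commands(guests: int, commands_per_guest: int) -> int:
    """Worst case: every guest has all of its commands in flight at once."""
    return guests * commands_per_guest


def will_return_busy(guests: int, commands_per_guest: int, queue_depth: int) -> bool:
    """True when worst-case demand exceeds what the device can queue."""
    return outstanding_commands(guests, commands_per_guest) > queue_depth


if __name__ == "__main__":
    # Queue depths taken from the article; guest figures are hypothetical.
    for label, depth in (("SATA", 32), ("FC/SAS", 256)):
        busy = will_return_busy(guests=10, commands_per_guest=8, queue_depth=depth)
        print(label, "busy" if busy else "ok")
```

Under these assumed numbers, ten guests with eight commands each overflow a 32-deep SATA queue but fit comfortably in a 256-deep Fibre Channel or SAS queue.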

Oversubscription within the SAN fabric

SANs are by design oversubscribed. Best practices call for an average ratio of 8:1 between server initiators and storage target ports. Application servers with higher IO or throughput demands require a lower oversubscription ratio; those with lower demands can tolerate a much higher one.

When physical application servers are consolidated through server virtualization and the SAN is not re-architected to reflect virtual server oversubscription, there is a much higher probability of application performance cratering. Poorly engineered SAN fabric oversubscription will lead to significant fabric blocking.
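A rough sketch of why consolidation changes the fabric math follows. It assumes, purely for illustration, that each guest generates about as much storage traffic as one of the physical servers it replaced; the function names and the guest counts are hypothetical, and only the 8:1 ratio comes from the text above.

```python
def effective_fan_in(hosts_per_target_port: int, guests_per_host: int) -> int:
    """Workloads contending for one target port after consolidation."""
    return hosts_per_target_port * guests_per_host


def hosts_per_port_for(target_fan_in: int, guests_per_host: int) -> int:
    """Hosts that can share a target port while keeping roughly
    `target_fan_in` workloads per port (never fewer than one host)."""
    return max(1, target_fan_in // guests_per_host)


if __name__ == "__main__":
    # An 8:1 fabric left unchanged after putting 10 guests on each host.
    print(effective_fan_in(hosts_per_target_port=8, guests_per_host=10))
    # Hosts per port needed to stay near 8 workloads per port with 4 guests each.
    print(hosts_per_port_for(target_fan_in=8, guests_per_host=4))
```

Under that assumption, an untouched 8:1 fabric with 10 guests per host behaves like an 80:1 fabric, which is the kind of hidden increase that re-architecting the SAN is meant to correct.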

Oversubscription within the target storage ports

Just as too much oversubscription within the SAN fabric can cause blocking that substantially reduces application to storage performance, so too can too much oversubscription to the target storage ports.

Conclusion

Oversubscription is not a bad thing and in fact is very useful in increasing asset utilization and reducing costs. Unfortunately, as the cliché about too much of a good thing warns, too much oversubscription leads to bad consequences.

My next column will discuss methodologies and solutions that help alleviate these bottlenecks.

 



 

 

 


 

Marc Staimer is president and CDS of Dragon Slayer Consulting in Beaverton, Oregon. He is widely known as one of the leading storage market analysts in the network storage and storage management industries. His consulting practice of 6+ years serves the end-user and vendor communities. Most of his consulting is in the areas of strategic planning as well as product and market development. Staimer's 23 years of marketing, sales and business experience in the storage, software and systems industries, combined with his years of research into the MIS community, give him unique business, systems and market expertise.

 

 
 
