Sometimes You Just Need to Break the Rules in Pursuit of the ULTIMATE VIRTUAL INFRASTRUCTURE
By Norm Hebert
published: Thursday, August 12 2004


I was recently contracted to design a highly available virtual infrastructure for a Fortune 100 company. The enterprise had the typical mix of several hundred Windows and Linux servers, dispersed across two data centers roughly six miles apart. The company had SAN controllers at both locations, connected by trunked 2Gb fiber circuits, and could replicate VMFS partitions between the two storage controllers using SRDF. It also owned plenty of dark fiber, already pulled and in place between the two facilities.

After absorbing all this information, I realized that this customer held the potential for me to design what I considered to be the Ultimate Virtual Infrastructure. We could create one VirtualCenter server farm with ESX servers hosted in both data centers. We could then use SRDF to replicate the VMFS volumes between storage controllers in an exposed LUN / hidden LUN configuration.

Thereafter, any of the 500 servers to be virtualized could be migrated, using VMotion, to any of the ESX servers in either data center, providing the ultimate in flexibility, resource utilization and disaster recovery/business continuity. In the event of the total failure of either data center, the storage controller at the surviving data center would expose the hidden replicated LUNs to the ESX servers at that site. The VMs contained in those replicated VMFS volumes could be seen and brought back online quickly after a manual LUN rescan on each surviving ESX server. They would, however, run at diminished performance, because the surviving ESX servers would be called on to run twice as many VMs as they would in a non-failed-over condition.
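
The "twice as many VMs" point can be made concrete with a little arithmetic. The sketch below is purely illustrative: the 500 VMs come from the project described above, but the even split between sites and the host counts are hypothetical numbers chosen for the example, not figures from the actual design.

```python
def vms_per_host(vm_count: int, host_count: int) -> float:
    """Average number of VMs each ESX host has to carry."""
    if host_count <= 0:
        raise ValueError("host_count must be positive")
    return vm_count / host_count


def failover_density(vms_a: int, hosts_a: int, vms_b: int, hosts_b: int):
    """Return (normal density at site A, density at site A after site B fails).

    After a total failure of site B, site A's hosts run their own VMs
    plus every VM restarted from the replicated LUNs.
    """
    normal = vms_per_host(vms_a, hosts_a)
    failed_over = vms_per_host(vms_a + vms_b, hosts_a)
    return normal, failed_over


# Hypothetical layout: 500 VMs split evenly, 10 ESX hosts per data center.
normal, failed_over = failover_density(vms_a=250, hosts_a=10, vms_b=250, hosts_b=10)
print(f"normal: {normal:.0f} VMs/host, after failover: {failed_over:.0f} VMs/host")
```

With a symmetric layout like this one, the surviving hosts carry exactly double their normal load, which is why each site needs headroom if performance after a failover matters.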

The only piece of the architecture still needed to make the first 60 servers of this Ultimate Virtual Infrastructure a reality was a single IP subnet, used for network communications by the VMs, stretched between the facilities. Dark fiber was already in place and was measured at 37,000 feet, a run whose distance and latency are well within what single-mode fiber can support. Naively, perhaps, I thought that the cost of purchasing a couple of GigE single-mode fiber switches and lighting up just two strands of this dark fiber, aggregated in a bond for high availability, would be negligible compared to the bang for the buck that my Ultimate Virtual Infrastructure would yield the company.
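
For a rough sense of what 37,000 feet of fiber means for latency, here is a back-of-the-envelope sketch. It assumes a refractive index of about 1.468 for single-mode fiber (a typical textbook value, not a measurement of this particular run) and counts propagation delay only, ignoring switch and transceiver delays.

```python
FEET_TO_METERS = 0.3048
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in a vacuum


def one_way_delay_us(length_feet: float, refractive_index: float = 1.468) -> float:
    """One-way propagation delay over a fiber run, in microseconds.

    refractive_index is an assumed typical value for single-mode fiber.
    """
    length_m = length_feet * FEET_TO_METERS
    speed_in_fiber = SPEED_OF_LIGHT_M_PER_S / refractive_index
    return length_m / speed_in_fiber * 1e6


delay = one_way_delay_us(37_000)
print(f"length: {37_000 * FEET_TO_METERS / 1000:.1f} km")
print(f"one-way delay: {delay:.1f} us, round trip: {2 * delay:.1f} us")
```

Under those assumptions the run is about 11.3 km long and adds roughly 55 microseconds each way, which is small next to typical LAN switching and host stack delays. Whether a given optic can actually drive that distance is a separate question to check against the transceiver's rated reach.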

This is where I ran into problems with the company’s Network Communications team. Traditionally, best practice in network design says that we use Layer 2 switching when communicating within a facility and Layer 3 routing to communicate between facilities. What I was asking to do was Breaking the Rules. I asked why, if extending SANs between geographic locations was now the hottest thing since sliced bread, extending Ethernet subnets should be such a taboo.

By the time we were done with our meeting, the NetComm team was insisting on the purchase of four Cisco 6500 core switches to extend the subnet across the two data centers, costing in the neighborhood of $300,000! Based upon my experience with the old non-routable DECnet protocols in the early 1980s (is anyone here, like me, old enough to remember having to cluster bridge LAT and MOP between facilities?), I understood that this inter-site L2 communication would need to be isolated from the primary routed corporate infrastructure to prevent the potential for spanning tree loops. But in this particular case, we only needed two GigE ports at each data center, scalable to perhaps eight at most, to make the Ultimate Virtual Infrastructure a reality. For a truly highly available configuration, we would need to purchase four switches, two for each data center for fault tolerance. My analysis determined that this network enhancement could easily be implemented for around $50,000 using Cisco 3750s.

Despite their outward hostility to the concept, it turned out that the NetComm team secretly loved the idea of implementing an isolated inter-data center L2 network. They let it slip that they had many other application clusters that could also benefit from inter-site failover, with cluster members split between the two data centers for geographic isolation. They just wanted our consolidation/virtualization project to absorb all the fixed costs of the spanned network so they could piggyback the other clusters onto our spanned L2 infrastructure at marginal cost.

This secret longing was the root of their insistence on purchasing the heavy-duty Cisco 6500s. When management decided that we would not be getting an additional $300,000 added to our consolidation/virtualization project budget, I was relegated to designing and delivering a two-farm model, using twice as much server hardware as would otherwise be necessary. This architect’s dream of creating and delivering the Ultimate Virtual Infrastructure crashed and burned - for the time being at least. So for me, at this point, the search goes on…


To submit a question for Norm to answer next month, please forward it to: AskNorm@Virtual-Strategy.com.

Norm Hebert is employed as the chief Windows Network Architect and Server and Storage Virtualization Specialist by Certified Network Consultants of Nashua, NH. Mr. Hebert has 6+ years of experience with MSCS, is fully Microsoft and VMware certified, and can be reached at: Norm@CertifiedNC.com.