FT, HA – What's the Difference?
By Dan Kusnetzky

published: Wednesday, December 19 2007

Suppliers in the virtual processing markets often discuss Fault Tolerant (FT) solutions and High Availability/Failover (HA) solutions as if they were the same thing. This allows them to sell their technology into both markets. Only later does the customer realize that they either bought more availability than they needed or, even worse, got less than they bargained for.

Although similar in nature, FT solutions go beyond HA to present an environment in which failures are not seen, not merely an environment that survives a failure after a small delay during which the systems reconfigure themselves. Some suppliers of FT technology call this "fail through" rather than failover. Although I thought that these were fairly well known concepts because they've been around for well over 30 years, I was surprised to find that the distinction is still not clear to some.

While speaking with a potential client about how different forms of virtualization could address his organization's requirements, I detected that some of my comments created confusion rather than clarifying things. (As an aside, it appears that I have an innate ability to make some technology appear more complex than it really needs to be.)

Approaches to Virtualization
Virtualization technology, taken broadly, offers a number of approaches, each with increased levels of availability. Here are a few of them:
  • Access to application solutions can be virtualized. If the back end system fails, the individual using the application is connected to another system that offers the same application. More sophisticated access virtualization software may make this process automatic. Even more sophisticated products in this area will remember the state of the application and give the impression that nothing ever failed. Doing this last bit, however, usually involves other forms of virtualization. This process, while fast, is unlikely to be instantaneous.
  • Application frameworks, part of the application virtualization layer, may offer load balancing and failover capabilities. The application framework monitor, upon detecting either a failure to meet service level objectives or some other type of failure, would start the application on another machine. Once again, the process could be automatic or require manual intervention. If other types of virtualization are in use, the actual state of the application could be saved during the process. While this process may happen quickly, it is likely that individuals using the application would notice a pause or a slow-down.
  • Processing virtualization, which includes clustering, parallel processing, operating system virtualization/partitioning software and virtual machine software, may offer load balancing and failover capabilities similar to those offered by application virtualization for some or all applications on a given system. The key difference between the levels of virtualization is that application framework virtualization only virtualizes applications running in that framework. Processing virtualization makes it possible for applications, data management products or even basic system services to fail over to another system. As with the other forms of virtualization, the failover process can take some time.
  • Virtualizing storage is often a necessity for all of the other forms of virtualization. After all, what good is moving an application over to another system if the data it was processing is either no longer available or doesn't contain uncommitted updates? Storage virtualization could be implemented using special purpose software on general purpose systems or by moving the entire storage function to a special purpose storage server.
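
To make the failover pattern shared by these approaches concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the `Node` and `FailoverRouter` classes are hypothetical names invented for this example, and the "reconfiguration" is reduced to swapping two references. Note that the failure is only discovered when a request fails, which is where the pause that users notice comes from.

```python
class Node:
    """A hypothetical application server that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverRouter:
    """Sends requests to the active node and fails over when it stops responding."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.failovers = 0

    def send(self, request):
        try:
            return self.active.handle(request)
        except ConnectionError:
            # The failure is noticed only now. Swapping the two nodes stands in
            # for the reconfiguration delay of a real HA product.
            self.active, self.standby = self.standby, self.active
            self.failovers += 1
            return self.active.handle(request)


primary, standby = Node("primary"), Node("standby")
router = FailoverRouter(primary, standby)

first = router.send("request-1")
primary.alive = False  # simulate a back-end failure
second = router.send("request-2")

print(first)
print(second)
print("failovers:", router.failovers)
```

The second request is answered, but only after the first attempt has failed and the router has switched nodes. Any application state held only on the failed node would be lost unless another layer, such as virtualized storage, preserved it.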


Perceived Failure is not an Option. What Do I Do?
All of these are well and good. What happens, however, when the requirement is that failures are never seen? This is the realm of FT systems. In this case special purpose, redundant hardware configurations are deployed that are run in lock-step. If one component of the system fails, the others continue working and the application does not fail. Making this work is a higher level of rocket science than any of the approaches listed above because it often involves spelunking deeply into the hardware.
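
The contrast with failover can be sketched in software, with the caveat that real FT systems do this in hardware and this Python analogy only captures the idea. Every replica performs the same operation on the same input, so a failed replica is simply outvoted rather than replaced. The function names are hypothetical, and taking the majority result is an assumption made for illustration; it is not a description of how any supplier's product resolves disagreements.

```python
from collections import Counter


def run_in_lockstep(replicas, operation, value):
    """Run the same operation on every replica and return the majority result."""
    results = []
    for replica in replicas:
        try:
            results.append(replica(operation, value))
        except RuntimeError:
            continue  # a failed replica contributes no result
    if not results:
        raise RuntimeError("all replicas failed")
    result, _votes = Counter(results).most_common(1)[0]
    return result


def healthy_replica(operation, value):
    """A replica that executes the operation normally."""
    return operation(value)


def failed_replica(operation, value):
    """A replica whose component has failed."""
    raise RuntimeError("component failure")


def double(x):
    return x * 2


answer = run_in_lockstep(
    [healthy_replica, failed_replica, healthy_replica], double, 21
)
print(answer)
```

The caller gets its answer on the first attempt, with no retry and no switch to a standby. That is the "fail through" behavior: the failure happened, but nothing outside the redundant set saw it.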

Historically, FT solutions were quite expensive. After all, every component of the system had to be replicated enough times to handle all expected failure scenarios. More recent solutions – offered by suppliers such as Stratus and Marathon – are based upon industry standard systems and components. The use of off-the-shelf hardware significantly reduces the price of these solutions.

They are still more expensive than HA solutions, but if your organization stands to lose millions if your systems are down for even a few moments…or vehicles are going to crash into one another, simply ruining a passenger's day…or that toxic substance your plant is processing is going to melt the city center, FT is likely to help.



Daniel Kusnetzky has over 30 years of industry experience. He is responsible for research and analysis on open source software, virtualization software and system software. He examines emerging technology trends, vendor strategies, research and development issues and end-user integration requirements. In the past he was executive vice president for Open-Xchange, Inc., and Program Vice President of System Software Research for International Data Corporation.
 