Q&A with Zohar Gilad of Precise Software
VSM: How does cloud computing require different strategies for monitoring, troubleshooting and risk management?
ZG: Cloud and virtualization technologies introduce new challenges to IT management on a number of fronts. First, there is competition for shared resources, such as CPU, memory, and particularly storage access, which creates new performance bottlenecks. Second, due to the dynamic nature of most cloud environments, bottlenecks are both harder to predict and harder to isolate. Third, there is still a lack of best practices and industry experience regarding the overhead and scalability of hypervisors, effective memory management, and the impact of the choice of CPU family, among other areas.
This all creates a “perfect storm” for data center and application owners if they are not vigilant. IT needs to adjust the way it monitors and troubleshoots applications in the cloud, tracking actual production transactions along with the guest and host servers, to gain an integrated and complete performance picture.
VSM: What do CIOs and other IT managers worry about most when it comes to moving applications and services to the cloud? What affects users most and how?
ZG: The analyst firm IDC asked exactly the same question in a recent survey. “Data security” was the top concern (for both public and private cloud), followed closely by “application performance” and “application availability” as the second and third concerns. This explains why many IT departments have so far deployed only non-production or non-mission-critical apps to the cloud. But that is changing.
VSM: Are enterprises doing anything specific today to mitigate their risk in the cloud?
ZG: Not enough. Due to the immense pressure to cut costs, many IT shops move to the cloud quickly, and later realize the risks when they are already midway through a project. A common misconception is that “guaranteed SLAs” by a cloud or hosting vendor will mitigate risk. This premise is flawed for two reasons:
- Most SLAs are based on simple “server metrics” (such as “CPU utilization” or “buffer hit ratio”), which say nothing about actual business transaction performance. Too often all server metrics look “green”, yet the business is choking on transaction bottlenecks.
- No matter what cloud providers promise, ultimately it’s the responsibility of corporate IT to keep the business up and running.
VSM: Should your strategy be different when you're talking about private versus public cloud infrastructure?
ZG: Since corporate IT is on the hook to guarantee business SLAs, it must understand what the end users are experiencing. That applies to both private and public clouds. Of course, for private clouds, there is more direct control. For public clouds, IT needs to leverage its relationship with the cloud provider to ensure that the provider understands the need for high performance and availability, and can measure and manage accordingly.

