Q&A with Steve Francis of LogicMonitor
VSM: LogicMonitor positions itself as a solution that helps companies extract intelligence from their tech infrastructure. What sorts of operational or business intelligence are LogicMonitor customers extracting from their virtual infrastructures?
SF: In addition to providing in-depth performance monitoring of virtual environments, LogicMonitor gives businesses a flexible platform to collect, graph, and alert on virtually any type of data that might be useful to their company. For example, the VP of client services at a managed service provider used to require a 20-minute meeting every morning with all 28 of his techs to discuss the status of tickets and who’s working on what. It took each tech 10 minutes to prepare data for that meeting.
Now he uses LogicMonitor to pull that data from their ticketing system and display the status of all tickets on a dashboard. He now has access to all that operational data on-demand, eliminating the need for a meeting, saving 14 hours of cumulative staff time every day, and providing insight into technical staffing needs.
Another example is a popular Web 2.0 company that uses LogicMonitor to track not only the performance of their infrastructure, but also key metrics such as which features in their web-based application are the most frequently used. This intelligence helps their developers determine where to focus their efforts, and which features may need improving.
VSM: What are the essential things to monitor in a virtual environment?
SF: The essential things to monitor are the resources in contention: CPU, memory, and disk operations per second. You need a top-10 overview of each of those metrics, by virtual machine, for every physical server.
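That "top-10 by VM" overview can be sketched in a few lines. This is a minimal illustration, not LogicMonitor's implementation; the VM names, field names, and sample figures are all hypothetical stand-ins for data a real collector would pull from the hypervisor's API.

```python
# Rank the VMs on one host by each contended resource Francis names:
# CPU, memory, and disk operations per second. Sample data is invented.

SAMPLES = [
    {"vm": "web-01", "cpu_pct": 72.0, "mem_mb": 4096, "disk_iops": 310},
    {"vm": "db-01",  "cpu_pct": 55.0, "mem_mb": 8192, "disk_iops": 940},
    {"vm": "app-02", "cpu_pct": 18.0, "mem_mb": 2048, "disk_iops": 45},
]

def top_n(samples, metric, n=10):
    """Return the top-n VMs on a host, ranked by a single metric."""
    return sorted(samples, key=lambda s: s[metric], reverse=True)[:n]

for metric in ("cpu_pct", "mem_mb", "disk_iops"):
    ranked = top_n(SAMPLES, metric)
    print(metric, "->", [s["vm"] for s in ranked])
```

The same three-line ranking, run once per metric per host, yields the dashboard-style overview described above.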
VSM: What are the unique challenges of monitoring virtualized environments vs. physical environments?
SF: You can use the exact same monitoring and methodology to monitor the memory on a system once it’s been virtualized. The fundamental difference is that you also need to monitor the memory usage of the virtualization platform -- such as the ESX host. Following that logic, you need visibility into which virtual machine is using resources on the virtualization platform, which is a completely different kind of monitoring. You’re not talking about one operating system’s memory usage; you’re talking about how much memory 20 operating systems are using when they’re all running as VMs on the same platform. Then it gets even more complicated, because the virtualization platform can trick the guest operating system about how much memory it actually has available through the use of ballooning.
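The ballooning point comes down to simple arithmetic, sketched below. The numbers are illustrative, not from any real host: when the hypervisor's balloon driver inflates inside a guest, the guest still reports its full configured memory, but the hypervisor is backing less of it with physical RAM -- which is why guest-level memory monitoring alone misleads.

```python
# Why guest-reported memory misleads under ballooning (illustrative numbers).

def hypervisor_backed_mb(configured_mb, ballooned_mb):
    """Physical RAM the hypervisor actually backs for a VM.

    The balloon driver has reclaimed ballooned_mb from the guest, so the
    guest's own view (configured_mb 'installed') overstates what the host
    is really providing.
    """
    return configured_mb - ballooned_mb

# A VM configured with 4 GB whose balloon driver has inflated by 1 GB:
# the guest OS still reports 4096 MB installed, but only 3072 MB of
# host memory back it.
print(hypervisor_backed_mb(4096, 1024))  # prints 3072
```

Only hypervisor-level monitoring sees the ballooned figure; inside the guest, the reclaimed memory just looks like pages pinned by a driver.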
When monitoring the hardware at the physical layer, you've got to make sure you catch something like a failed fan or power supply, or a bad disk behind a RAID controller. If your standard monitoring tool can monitor your Dell server hardware, it expects to talk via SNMP to the Dell agent installed on that server. But you can’t install a Dell agent on a Dell server running ESX, because there’s nowhere to install it. Since the agent can’t run, you can’t talk to it; you have to query the ESX server using a completely different protocol.
You can monitor your operating system, and the applications running on it, with the legacy tools you used before you virtualized, but they may not be able to tell you the root cause of a performance issue, since the root cause may lie in the virtualization layer or in the storage layer the virtualization platform is using. Legacy tools have no visibility into that essential level of virtualization monitoring.