Cross-Domain Analytics for the Virtualized Data Center
By Mike Matchett
Published: Wednesday, February 27, 2008
Cross-domain analytics offers great benefit to IT managers
who are frustrated by the common scenario of fixing one problem only to cause
new, unforeseen issues and failures in other processes.
It provides the ability to look across
what were once siloed technology domains (servers, storage, applications, etc.),
understand the interactions between them, predict and pre-empt problems, and
make sure that business processes are operating optimally. This means that
service level objectives are easier to define and support, and that IT can
deliver on service level agreements with better performance and increased
reliability.
Analytics for the Virtualized Data Center
Most major IT virtualization efforts are designed to create
shared pools of resources, greatly increasing "efficiency" from an investment
perspective. Virtualization also provides a shared reserve of available
capacity and a dynamic environment to flexibly handle changes on demand. But
the inherent abstraction that today's server and storage virtualization
technologies present is a real challenge for IT management.
Cross-domain analytics enables IT to visualize a service's
allocated infrastructure across multiple virtualized IT domains. It gives IT
the ability to understand, and prove to the business, that resources buried
several layers down in the IT stack are contributing effectively to service
performance. Cross-domain analytics is needed so IT can determine whether
resources are being used efficiently and optimally by the business, and so IT
can manage the virtualized data center.
The Need for Cross-Domain Analytics
Existing system management solutions can address the
business view of workloads and can measure end-user response times. This is
great for reaching maturity in service level management. There are also many
element management tools that enable IT to generate reports showing CPU
utilization on servers, disk space used, and available network bandwidth.
With this information, can't the business then
simply hold IT accountable for good end-user response time and high
device-level utilization?
This approach was only practical when devices were dedicated
one-to-one with services so that we could treat all the resources for a service
as a dedicated system. When a business transaction is performed on today's
virtualized IT infrastructure, it travels across a web of IT domain-specific
service providers. IT now needs the ability to manage performance across
multiple domains, including the ability to obtain internal service metrics in
each domain on virtualized and shared resources.
New cross-domain solutions are emerging that help IT
generate these service metrics across virtualized domains. These new solutions
collect data from within each IT domain to peel back the virtualization
abstraction and gain visibility into the actual resources assigned to each
service. IT gains the ability to model the queuing behavior across both
physical and virtualized domains to provide the necessary management insight
into how efficiently enterprise resources are being utilized. There is now an
efficient way to measure internal IT performance that can make sense to the business.
Three Basic Metrics for Measuring Performance
Organizations can start by examining the three basic
performance metrics that describe a system treated as a
single "box". Since we are concerned with IT's performance in service delivery,
our basic metrics are:
- Application Workload
- Response Time
- Utilization
Application workload refers to the load or the demand level
placed on the system by users. In a stable system this is also equal to the
throughput, and it is usually described by the business in terms of business
transactions. IT will need to put some effort into translating a business
transaction into units of work that are executed within the IT infrastructure,
but this is often addressed through established capacity planning and
chargeback methodologies.
Response time is the primary measurement of performance and
it measures the time each transaction takes to complete. End-user transactions
can be externally clocked in many ways, and this is often accomplished through
implementation of service level management solutions. Utilization measures the
effective busyness of the IT system that services the workload. Once
utilization reaches 100%, no additional work can be performed. These three
metrics are related by queuing theory, which in a nutshell states that as more
work goes through a system, utilization rises linearly, but response time
degrades non-linearly, climbing steeply as utilization approaches 100%.
Therefore, if IT only cared about maximizing throughput, we
could drive enough work to make the system 100% busy. But if the enterprise
also cares about performance service levels, we have to do some queuing math to
understand how much work the system can perform before it slows down.
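As a rough illustration of this queuing math, the sketch below assumes the simplest textbook model, a single-server M/M/1 queue. The article does not name a specific model, so the formulas U = X * S and R = S / (1 - U), the 10 ms service time, and the 50 ms response-time target are all illustrative assumptions, not the method of any particular product.

```python
def utilization(arrival_rate: float, service_time: float) -> float:
    """Utilization law U = X * S: busyness grows linearly with workload."""
    return arrival_rate * service_time


def response_time(arrival_rate: float, service_time: float) -> float:
    """Mean response time under the assumed M/M/1 model: R = S / (1 - U)."""
    u = utilization(arrival_rate, service_time)
    if u >= 1.0:
        raise ValueError("system is saturated (utilization >= 100%)")
    return service_time / (1.0 - u)


def max_throughput(service_time: float, target_response: float) -> float:
    """Largest workload that still meets the response-time target.

    Solving R = S / (1 - U) for U gives U_max = 1 - S / R_target,
    and the utilization law gives X_max = U_max / S.
    """
    if target_response <= service_time:
        raise ValueError("target must exceed the bare service time")
    max_util = 1.0 - service_time / target_response
    return max_util / service_time


if __name__ == "__main__":
    service_time = 0.010  # assumed: 10 ms of service per transaction
    for rate in (10, 50, 90, 99):  # transactions per second
        u = utilization(rate, service_time)
        r = response_time(rate, service_time)
        print(f"{rate:>3} tx/s  utilization={u:.0%}  response={r * 1000:.1f} ms")
    # Assumed service level: mean response time of 50 ms or better.
    print(f"max workload: {max_throughput(service_time, 0.050):.0f} tx/s")
```

Under these assumptions, a 50 ms target caps the system at roughly 80% utilization, well short of the 100% that pure throughput maximization would allow.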
Managing the Virtualized Data Center
As an IT system is decomposed into physical and virtual
management domains, the performance metrics described above can now be
generated at each service layer:
- Transaction Workload
- Internal Response Time
- Effective Utilization
The Transaction Workload refers to the amount of resource
required from each domain to service a customer's request. The Internal
Response Time is a measure of how long it takes for a transaction to complete
its work across a specified set of IT domains. If you manage IT infrastructure
that includes both server and storage domains, you might create an
Infrastructure Response Time metric that will serve as your primary service
measurement.
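One simple way to picture an Infrastructure Response Time metric is as the sum of the time a transaction spends in each managed domain. The sketch below assumes the domains are visited in series, so their times add. The `DomainSample` type, the domain names, and the millisecond figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class DomainSample:
    """Time one transaction spent inside a single IT domain (hypothetical)."""
    domain: str
    response_ms: float


def infrastructure_response_time(
    samples: Iterable[DomainSample],
    domains: Tuple[str, ...] = ("server", "storage"),
) -> float:
    """Sum the per-domain times for the domains this team manages.

    Assumes the transaction passes through the domains one after another.
    """
    return sum(s.response_ms for s in samples if s.domain in domains)


if __name__ == "__main__":
    transaction = [
        DomainSample("server", 4.0),
        DomainSample("storage", 7.5),
        DomainSample("network", 1.0),  # outside the managed set, so ignored
    ]
    print(infrastructure_response_time(transaction), "ms")
```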
Effective Utilization is a measure of the physical and
virtual resources used to deliver a service. For example, a virtual server with
a specified "CPU limit" would be at 100% effective utilization at that limit. While
utilization measurements are traditionally used for resource-level capacity
planning, there are key derived scores and indices that can be built from the
new cross-domain performance models to help directly manage IT efficiency and
agility, both at the domain level and at the overall data center level.
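The "CPU limit" example can be sketched as follows: utilization is measured against the ceiling the virtual server is actually allowed to reach, not against the raw capacity of the host. The function name, the MHz units, and the figures are assumptions made for illustration.

```python
from typing import Optional


def effective_utilization(
    used_mhz: float,
    host_capacity_mhz: float,
    cpu_limit_mhz: Optional[float] = None,
) -> float:
    """Fraction of the usable ceiling that is consumed (0.0 to 1.0).

    The ceiling is the configured CPU limit when one is set, otherwise
    the full host capacity.
    """
    ceiling = host_capacity_mhz
    if cpu_limit_mhz is not None:
        ceiling = min(cpu_limit_mhz, host_capacity_mhz)
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return min(used_mhz / ceiling, 1.0)


if __name__ == "__main__":
    # A virtual server using 1000 MHz on an 8000 MHz host looks idle...
    print(effective_utilization(1000, 8000))
    # ...but with a 1000 MHz CPU limit it is already at 100%.
    print(effective_utilization(1000, 8000, cpu_limit_mhz=1000))
```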
IT Key Performance Indicators
A well-built, cross-domain performance model first produces
an optimal operating goal for each resource that indicates the maximum
effective utilization while still ensuring solid performance. For each IT
service, IT management can then report on the following system performance
indicators:
- The Performance Index is a score
for how well a particular application's set of assigned resources is
being utilized compared to the optimum level. This index immediately shows
whether resources have remaining capacity, are being over-utilized or are
efficiently aligned to meet demand. The percent of time this index remains
in a favorable range can also be used as an indicator of system
performance reliability.
- System Efficiency is a Key
Performance Indicator (KPI) that tracks alignment of IT resources to application
requirements over time. Highly efficient systems allocate just enough
resource to meet current loads. Inefficient systems might be ripe for
consolidation or technology refresh initiatives.
- System Agility is a KPI that
demonstrates the variance in the alignment of IT resources to workload
over time. A high variance indicates low agility of the IT domains to
respond to changing workload, likely because of inflexibly dedicated
resources. Virtualized and dynamically re-balanced domains will have high
agility scores.

Figure 1. Performance Index enables IT to quickly
assess whether an application's set of assigned resources is optimally
utilized.
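The article does not give formulas for these indicators, so the sketch below uses deliberately simple stand-in definitions to show how they might relate: an index of 100 means utilization sits exactly at the optimal operating goal, efficiency penalizes the average distance from that goal, and agility penalizes variance over time. Every formula here, as well as the 80 to 110 "favorable" band, is an assumption.

```python
from statistics import fmean, pstdev
from typing import Sequence


def performance_index(effective_util: float, optimal_util: float) -> float:
    """Below 100: spare capacity. Above 100: over-utilized. (Assumed scale.)"""
    if optimal_util <= 0:
        raise ValueError("optimal utilization must be positive")
    return 100.0 * effective_util / optimal_util


def time_in_range(indices: Sequence[float], low: float = 80.0,
                  high: float = 110.0) -> float:
    """Percent of samples inside the assumed favorable band."""
    hits = sum(1 for i in indices if low <= i <= high)
    return 100.0 * hits / len(indices)


def system_efficiency(indices: Sequence[float]) -> float:
    """Assumed: 100 minus the mean distance of the index from 100."""
    return max(0.0, 100.0 - fmean([abs(i - 100.0) for i in indices]))


def system_agility(indices: Sequence[float]) -> float:
    """Assumed: 100 minus the standard deviation of the index over time."""
    return max(0.0, 100.0 - pstdev(indices))


if __name__ == "__main__":
    optimal = 0.70  # hypothetical optimal operating goal for this resource
    observed = [0.35, 0.70, 0.70, 1.05]  # hypothetical utilization samples
    idx = [performance_index(u, optimal) for u in observed]
    print("index:", [round(i) for i in idx])
    print("time in range:", time_in_range(idx))
    print("efficiency:", round(system_efficiency(idx), 1))
    print("agility:", round(system_agility(idx), 1))
```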
Cross-domain analytics can then be implemented to measure
the productive use of IT resources. Data Center Efficiency and
Data Center Agility scores can be determined by rolling up System Efficiency
and System Agility scores from IT domains. Internal Response Time numbers can
be aggregated to produce a Data Center Effectiveness rating. If these KPIs are
carefully constructed to be on a 0 to 100 scale, then any business or IT
manager can easily determine the state of IT and evaluate how efficiently the
organization is leveraging IT resources. Data center performance metrics have
become useful in deriving KPIs for managing IT Infrastructure from a business
perspective. IT can now report service performance scores in:
- Effectiveness in delivering service
- Efficiency with respect to meeting performance requirements
- Agility in responding dynamically to change
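The roll-up of system-level scores into a data center score could be as simple as a weighted average kept on the 0 to 100 scale. The sketch below assumes exactly that. The weighting scheme (for example, by share of workload) and the sample scores are hypothetical, since the article does not specify how the roll-up is performed.

```python
from typing import Mapping, Optional


def roll_up(
    scores: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Weighted average of per-system scores, clamped to the 0-100 scale."""
    if not scores:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}  # equal weighting by default
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    value = sum(scores[name] * weights[name] for name in scores) / total
    return min(100.0, max(0.0, value))


if __name__ == "__main__":
    efficiency = {"server": 80.0, "storage": 60.0}  # hypothetical scores
    print("equal weights:", roll_up(efficiency))
    print("server-heavy:", roll_up(efficiency, {"server": 3.0, "storage": 1.0}))
```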
When IT has real numbers that it can take back to the
business to show how it is operating its virtualized data center
infrastructure, it gains a tremendous amount of credibility. As IT and its
business counterparts negotiate with this kind of information between them, it
becomes possible to accurately:
- Assess past performance
- Make intelligent new IT investment decisions
- Set realistic and measurable goals.
Utilizing KPIs to Ensure Data Center Efficiency
With the right cross-domain performance management solution,
the data center KPIs mentioned above are automatically created for both
dedicated physical and dynamic virtualized architectures. This enables IT to
manage infrastructure domains day to day and ensure reliable service
delivery.
Deviations from "normal" can quickly raise alerts and be acted
upon. Service support processes can be driven and managed "horizontally" across
the whole set of IT domains. Even more powerfully, IT management now has real
metrics that can trigger, guide and assess the results of projects designed to
optimize infrastructure. Poor efficiency scores can be used to initiate
consolidation efforts. Low agility scores can drive virtualization deployments.
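A minimal sketch of how a deviation from "normal" might be flagged, assuming that normal is defined as a band of k standard deviations around the mean of a recent baseline. The threshold of 3, the function name, and the sample response times are illustrative assumptions, not a description of any product's alerting logic.

```python
from statistics import fmean, pstdev
from typing import Sequence


def is_deviation(baseline: Sequence[float], sample: float, k: float = 3.0) -> bool:
    """True when the sample falls outside mean +/- k standard deviations."""
    mean = fmean(baseline)
    spread = pstdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return sample != mean
    return abs(sample - mean) > k * spread


if __name__ == "__main__":
    # Hypothetical infrastructure response times in milliseconds.
    baseline_ms = [10, 12, 11, 9, 10, 11, 10, 9]
    for observed in (11, 20):
        state = "ALERT" if is_deviation(baseline_ms, observed) else "normal"
        print(f"{observed} ms -> {state}")
```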
While these scores are being raised, infrastructure response times
can be monitored to ensure that effective service delivery is
maintained or even improved. The models that produce these metrics can also be
used predictively to evaluate future scenarios. For example, if the business
is forecasting a growth campaign, IT can project what types of new investments
it will need, and what the various investment scenarios will mean for IT
service effectiveness and data center efficiency.
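A growth forecast of this kind can be turned into a quick what-if comparison. The sketch below assumes a single-server M/M/1 queuing model (R = S / (1 - U)), a 50% workload increase, and two hypothetical investment scenarios that differ only in per-transaction service time. None of these figures come from the article.

```python
import math


def projected_response(current_rate: float, growth: float,
                       service_time: float) -> float:
    """Mean response time after growth, under the assumed M/M/1 model.

    Returns infinity when the projected workload saturates the system.
    """
    rate = current_rate * (1.0 + growth)
    util = rate * service_time
    if util >= 1.0:
        return math.inf
    return service_time / (1.0 - util)


if __name__ == "__main__":
    current_rate = 60.0  # assumed: transactions per second today
    growth = 0.5         # assumed: the business forecasts 50% more workload
    scenarios = {        # hypothetical service time per transaction, seconds
        "keep current hardware": 0.010,
        "upgrade to faster tier": 0.006,
    }
    for name, service_time in scenarios.items():
        r = projected_response(current_rate, growth, service_time)
        print(f"{name}: {r * 1000:.1f} ms")
```

With these numbers, the unchanged system would run at about 90% utilization with a response time near 100 ms, while the faster tier would stay near 13 ms.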
Technology refresh requirements, vendor lease negotiations
and even outsourcing alternatives can be fairly evaluated for impact. Service delivery
processes that help align IT with the business are now enabled with
mathematically based decision-making information.
Leveraging Cross-Domain Analytics to Manage IT as a Business
The virtualized data center can now leverage cross-domain
analytics to manage virtualized server and storage infrastructure as a coherent
system. Cross-domain analytics allows IT to see across, as well as drill down
into, what were previously siloed technology domains. The enterprise can
measure the interactions between domains and proactively avoid potential
problems.
IT can leverage cross-domain analytics to make sure that
business processes are operating smoothly, and that server and storage
resources are being efficiently utilized.
IT investment decisions and project evaluations can now be
made with mathematical certainty, and technical initiatives and efficiency
objectives can be measured for success. Organizations can manage IT as a
business, with intelligent, automatic KPIs that can justify, measure and validate
IT initiatives like server consolidation, virtualization or technology
upgrades.
Related Links:
Akorri, Cross-domain analytics, Autotrader.com Choses Akorri
Mike Matchett leads Akorri's product marketing with over 17 years of experience in IT
systems management. He joined Akorri from BGS Systems and BMC Software where he
most recently rolled out the PATROL Perceive product and BMC's Performance and
Capacity Planning Managed Services. Before BMC, Mike managed IT networking
projects for federal intelligence agencies. Previously Mike had been a USAF
officer serving in Desert Shield/Desert Storm.