February 28, 2013 - Infrastructure management has never been easy, and neither virtualization nor the cloud has made it any easier. Non-physical environments are so dynamic and so complex that humans can no longer handle many of the provisioning and configuration tasks they once performed. For these, only a fully automated management stack will do. But even automation hasn’t absolved the human brain of all responsibility. As eG Innovations’ Srinivas Ramanathan points out, it only means we will have to contend with higher-order tasks.
Cole: It's been apparent for some time that physical, virtual and cloud environments pose radically different management challenges, even when together they comprise a single, integrated data environment. How should the enterprise approach this trinity without creating a lot of redundancy within the management stack?
Ramanathan: Traditionally, enterprises have taken a very silo-centric approach to management. However, virtualization is not just another technology used by the enterprise. In fact, virtualization fundamentally changes many of the ground rules that management software has used in the past. For instance, traditional management software has been designed with the assumption that applications are static - they run on the same system all the time. However, unlike in physical infrastructures, an application and its virtual machine can move dynamically from one physical machine to another as a result of Live Migration and other technologies supported by the virtualization platform. Virtualization also introduces new types of dependencies. For example, VMs that run on the same physical host share the resources of the physical server. One malfunctioning VM on the physical host can impact the performance of the other VMs on the same host. Management systems need to be aware of this new reality.
For this reason, enterprises will need to look for truly virtualization-aware management systems. To enable rapid diagnosis of problems and to ensure user satisfaction and productivity, the management system needs to be able to collect metrics from every layer of every tier, analyze and correlate the metrics using different types of inter-dependencies - application to VM, VM to physical host - and determine exactly what needs to be done to resolve a problem. This way, when a user complains that the service is slow, administrators can easily determine where the cause of the problem lies - is it the network? The database? Application? Virtualization platform? Storage?
Cloud computing poses a different set of demands on management systems. First, cloud environments have different domains of control. The cloud instance is the responsibility of the cloud provider, but the application is managed by the business operations teams. The management system should be able to clearly demarcate which domain is causing a problem. Second, the essence of cloud computing is automation and agility. Management systems have to be designed to support automation and agility as well. For example, the management software should be able to be provisioned without any human intervention, so that when cloud instances are created on demand, the management software is set up automatically as well.
Management systems designed to handle virtualization and cloud computing can handle physical infrastructures as well. Unfortunately, the converse is not true. Because of the diverse, new requirements that virtualization and cloud computing pose, it is not a matter of retrofitting an existing management system for these new requirements.
Cole: eG Innovations specializes in intelligent, virtualization-aware management solutions. How will these attributes come into play in dynamic, highly scalable environments?
Ramanathan: The more dynamic the environment, the harder it is to manage with manual processes and expertise. In dynamic environments, inter-dependencies can change in real time: a VM and the applications it hosts could move from one physical server to another in the cluster at any moment. Traditional management systems that rely on manually defined, static if-then-else rules for correlating alerts and determining root-cause information are inadequate for such infrastructures.
Virtualized infrastructures include several additional tiers, and a management system for virtualized infrastructures can collect thousands of metrics for each server or tier. The traditional approach of manually analyzing hundreds of thousands of metrics across dozens of domains is no longer scalable or manageable by humans. The sheer complexity and volume of often contradictory data render human intervention inadequate and reactive. At the same time, IT is more critical to business processes than ever before, and every additional minute of downtime frustrates customers and costs companies tens of thousands of dollars.
Therefore, intelligent management systems are required to automatically analyze hundreds of thousands of metrics, determine what the norms of these metrics are and alert administrators proactively whenever any metric violates its normal baseline. By automatically detecting baselines for thousands of metrics, these systems can save IT operations teams a lot of time and effort during configuration of the management system. Once alerts are generated, the next challenge is to correlate these alerts and separate the root-cause alerts from their effects. In this way, operations teams can focus on the root cause of problems, rather than become distracted by the effects of problems.
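As a rough illustration of automatic baselining, the sketch below learns a normal band for each metric from its own history and flags only the readings that fall outside it. The metric names, the sample values and the choice of a mean plus or minus three standard deviations band are assumptions made for this example, not a description of how eG Enterprise computes its baselines.

```python
from statistics import mean, stdev

def learn_baseline(samples, k=3.0):
    """Return a (low, high) band: mean +/- k standard deviations of the history."""
    mu = mean(samples)
    sigma = stdev(samples)
    return mu - k * sigma, mu + k * sigma

def violates(value, band):
    """True when a reading falls outside the learned band."""
    low, high = band
    return value < low or value > high

# Hypothetical history for two metrics; no thresholds are configured by hand.
history = {
    "cpu_percent": [40, 42, 38, 41, 39, 40, 43, 37],
    "disk_latency_ms": [5, 6, 5, 7, 6, 5, 6, 6],
}
bands = {name: learn_baseline(values) for name, values in history.items()}

# Latest readings: CPU is within its norm, disk latency is far outside it.
latest = {"cpu_percent": 41, "disk_latency_ms": 25}
alerts = [name for name, value in latest.items() if violates(value, bands[name])]
print(alerts)
```

The point of the design is that the same two functions apply to every metric, so adding a thousand more metrics adds no manual threshold configuration.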
Intelligent management systems like our eG Enterprise software are capable of correlating between alerts automatically. To be able to do so, eG Enterprise auto-discovers the infrastructure in real time and detects several types of inter-dependencies. For instance, it can detect VM-to-physical-machine dependencies, application-to-VM dependencies and inter-application dependencies. Using this information and knowing which alert belongs to which tier, an intelligent management system can help determine where the root cause of a problem lies. Only through automatic correlation will IT administrators have the actionable insight to fine-tune their environment and accelerate diagnoses to the point where potential problems are fixed before users call and complain about slow apps.
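One simple way to picture dependency-based correlation: treat the discovered inter-dependencies as a graph, and call an alert a root cause only if nothing it depends on, directly or indirectly, is also alerting. The sketch below is a minimal illustration under that assumption; the component names and the `root_causes` function are hypothetical and do not represent eG Enterprise's actual correlation logic.

```python
def root_causes(depends_on, alerting):
    """Return alerting components that have no alerting dependency beneath them."""
    def has_alerting_dependency(node, seen):
        for dep in depends_on.get(node, ()):
            if dep in seen:
                continue  # guard against cycles in the dependency map
            seen.add(dep)
            if dep in alerting or has_alerting_dependency(dep, seen):
                return True
        return False

    return sorted(n for n in alerting if not has_alerting_dependency(n, {n}))

# Hypothetical discovered dependencies: application -> VM, VM -> physical host,
# plus one inter-application dependency (web-app uses database).
depends_on = {
    "web-app": ["vm-1", "database"],
    "database": ["vm-2"],
    "vm-1": ["host-a"],
    "vm-2": ["host-a"],
}

# Three alerts fire, but two of them are effects of the host problem.
alerting = {"web-app", "database", "host-a"}
print(root_causes(depends_on, alerting))
```

Because the dependency map is plain data, it can be refreshed whenever discovery detects a change, such as a VM moving to another host, without rewriting any correlation rules.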
Cole: As enterprises become more cloud-like, they will have to contend with new concepts like multi-tenancy, service-based provisioning and metered usage. How can the management system simplify these processes?
Ramanathan: Enterprise infrastructures are indeed very similar to service provider environments. For instance, most large enterprises have different domains of control - servers may be operated by a systems team, but the applications are the responsibility of the business teams. Furthermore, the same infrastructure could support different business units and hence, multi-tenancy is an important concept that must be addressed by the management system. Metering is also a necessity since the cost of the infrastructure and its operation has to be apportioned among different business units sharing the infrastructure.
Management systems must have native support for multi-tenancy. The management system should provide personalized views, so that users who log in see only the services, applications or servers they are responsible for. Role-based access is also often necessary since different stakeholders have different needs.
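A personalized, role-based view can be thought of as a filter over the managed inventory. The sketch below assumes a hypothetical data model in which every component carries a tenant (business unit) and a kind, and each role maps to the kinds it may see; the names `Component`, `ROLE_KINDS` and `view_for` are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    tenant: str   # business unit that owns the component
    kind: str     # "server", "application" or "service"

# Hypothetical role definitions: which kinds of component each role may see.
ROLE_KINDS = {
    "systems": {"server"},
    "business": {"application", "service"},
}

def view_for(tenant, role, components):
    """Return the names of components visible to a user of this tenant and role."""
    allowed = ROLE_KINDS.get(role, set())
    return [c.name for c in components if c.tenant == tenant and c.kind in allowed]

inventory = [
    Component("crm-app", "sales", "application"),
    Component("crm-server", "sales", "server"),
    Component("payroll-app", "finance", "application"),
]

print(view_for("sales", "business", inventory))
print(view_for("sales", "systems", inventory))
```

An unknown role maps to an empty set of kinds, so the default in this sketch is to show nothing rather than everything.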
Automated provisioning is also a requirement, particularly with the widespread adoption of cloud computing. When a cloud instance is automatically provisioned, it is necessary to install the management software agent and configure the management system console automatically so the appropriate applications are managed, and then create the necessary user views. To enable this, the management system must support a programmable API or a command line interface that can be used by orchestration systems to automate the provisioning of the systems, applications and the management software.
To support metering, management systems must collect usage information at a great degree of granularity. This information forms the basis for metering and reporting on usage.
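To make the apportioning step concrete, here is a minimal sketch that splits a shared infrastructure cost among business units in proportion to their recorded usage. The record format, the use of CPU hours as the usage measure and the figures are all assumptions for illustration; a real metering scheme would likely weigh several resources.

```python
from collections import defaultdict

def apportion(records, total_cost):
    """Split total_cost among business units in proportion to their usage."""
    usage = defaultdict(float)
    for unit, cpu_hours in records:
        usage[unit] += cpu_hours
    total = sum(usage.values())
    if total == 0:
        return {unit: 0.0 for unit in usage}  # nothing was used, nothing to charge
    return {unit: round(total_cost * hours / total, 2) for unit, hours in usage.items()}

# Hypothetical fine-grained usage records: (business unit, CPU hours consumed).
records = [("sales", 30.0), ("finance", 10.0), ("sales", 20.0), ("hr", 40.0)]
charges = apportion(records, 1000.0)
print(charges)
```

The finer the granularity of the collected records, the more defensible the resulting split, which is why the usage data has to be captured in detail at the source.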