EMC PowerVault ME Controller Status Test

The storage controller is essentially a server that performs a wide range of functions for the storage system. Each controller has an I/O path to the storage network or the directly attached servers, an I/O path to the attached storage devices or shelves of devices, and a processor that handles the movement of data as well as other data-related functions, such as RAID and volume management. In the modern data center, the performance of the storage system is directly affected, and in many cases determined, by the overall health of the storage controller.

In a single-controller configuration, if the storage controller crashes, the storage system as a whole becomes inaccessible to users. This is why it is good practice to opt for a dual-controller configuration. A dual-controller configuration improves application availability because, in the unlikely event of a controller failure, the affected controller fails over to the surviving controller with little interruption to the flow of data. However, since failover occurs automatically when a controller fails, administrators may not know why the primary controller failed or whether the secondary has taken over. This is where the EMC PowerVault ME Controller Status test helps.

This test not only monitors the status of each controller in the EMC PowerVault ME storage system, but also promptly reports controller failures and the reason for the failure. In the process, the test also indicates whether the primary controller has failed over to the secondary or not.

Target of the test : An EMC PowerVault ME storage system

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each controller of the EMC PowerVault ME storage system being monitored.

Configurable parameters for the test
Parameter Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured. Since the storage device is managed using the IP address of its storage controller, the same will be displayed as host. In case of a dual-controller configuration, the IP address of the primary controller will be displayed here.

Port

The port number at which the specified host listens. By default, this is NULL.

Additional Controller IP

By default, this test always connects to the Host to collect metrics. If the Host is unavailable, the test cannot execute, because Additional Controller IP is set to none by default.

If the monitored storage device has two controllers, you can configure the test to connect to an alternate controller whenever the Host is unreachable. To do so, specify the IP address of the alternate controller in the Additional Controller IP text box.
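
The fallback described above can be sketched as follows. This is only an illustration and not the eG agent's actual implementation: `pick_controller` and the `is_reachable` callback are hypothetical names, the addresses are documentation placeholders, and the sketch assumes that the literal value `none` means no alternate controller has been configured.

```python
from typing import Callable, Optional


def pick_controller(
    host: str,
    additional_controller_ip: str,
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the controller address to query, or None if none can be reached."""
    # The Host is always tried first.
    if is_reachable(host):
        return host
    # Fall back only when an alternate controller has been configured.
    alternate = additional_controller_ip.strip()
    if alternate and alternate.lower() != "none" and is_reachable(alternate):
        return alternate
    return None


# Example: the primary controller is down and only the alternate answers.
reachable = {"192.0.2.11"}
print(pick_controller("192.0.2.10", "192.0.2.11", reachable.__contains__))
print(pick_controller("192.0.2.10", "none", reachable.__contains__))
```

With Additional Controller IP left at none, the second call returns `None`, which mirrors the case where the test cannot execute.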

User and Password

To monitor an EMC PowerVault ME storage system, the eG agent has to be configured with the credentials of a user who has been assigned the Monitor role. Specify the login credentials of such a user in the User and Password text boxes. To know how to create such a user, refer to Pre-requisites for monitoring the EMC PowerVault ME storage system.

Confirm Password

Confirm the password by retyping it here.

ServicePort

The Management Controller of the EMC PowerVault ME storage system provides access for monitoring and management via the HTTP and HTTPS protocols for XML API request/response semantics. To enable the eG agent to access the management controller, invoke the XML API commands, and collect the required metrics, you need to specify the service port on the controller that listens for HTTP/HTTPS requests for XML API semantics. By default, this is port 80.
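
How the Host, ServicePort, and SSL parameters of this test combine into an address can be illustrated with a short sketch. The function name `management_url` is hypothetical, the address is a documentation placeholder, and port 443 in the second call is only the conventional HTTPS port, not a value taken from this page. The request paths of the XML API itself are deliberately left out; consult the storage system's API reference for those.

```python
def management_url(host: str, service_port: int = 80, ssl: bool = False) -> str:
    """Build the base URL at which the management controller would be contacted."""
    # SSL set to True selects HTTPS; the default (False) selects HTTP.
    scheme = "https" if ssl else "http"
    return f"{scheme}://{host}:{service_port}"


print(management_url("192.0.2.10"))                 # defaults: HTTP on port 80
print(management_url("192.0.2.10", 443, ssl=True))  # an SSL-enabled system
```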

Timeout

Specify the time duration for which this test should wait for a response from the storage system in the Timeout text box. By default, this is 60 seconds.

SSL

By default, the EMC PowerVault ME system is not SSL-enabled, which is why this flag is set to False by default. If the system is SSL-enabled, change this flag to True.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
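
The two conditions above amount to a simple conjunction, sketched below. The function and parameter names are hypothetical, and the sketch reads the second condition as "neither frequency is 0", which is an interpretation of the wording rather than documented behavior.

```python
def dd_option_available(
    license_allows_dd: bool, normal_frequency: int, abnormal_frequency: int
) -> bool:
    """True when the On/Off detailed diagnosis option would be offered."""
    # Both conditions must hold: the license permits detailed diagnosis,
    # and neither the normal nor the abnormal frequency is 0.
    return license_allows_dd and normal_frequency != 0 and abnormal_frequency != 0


print(dd_option_available(True, 1, 1))
print(dd_option_available(True, 0, 1))
```
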
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Status

Indicates the current operational status of this controller.

The values that this measure can report and their corresponding numeric values are tabulated below:

Measure Value    Numeric Value
Down             0
Unknown          1
Operational      2
Not installed    3

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the operational status of a controller. However, in the graph of this measure, controller status will be represented using the corresponding numeric equivalents only.
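
When reading the graph of this measure, the numeric values have to be translated back into Measure Values. A minimal sketch of that lookup is given below; the mapping is taken from the table above, while `describe_status` is a hypothetical helper name and the sample values are made up for illustration.

```python
# Numeric equivalents of the Status measure, as tabulated above.
STATUS_VALUES = {0: "Down", 1: "Unknown", 2: "Operational", 3: "Not installed"}


def describe_status(numeric_value: int) -> str:
    """Translate a graphed numeric value back into its Measure Value."""
    return STATUS_VALUES.get(numeric_value, f"Unrecognized ({numeric_value})")


# Illustrative samples only: two healthy readings followed by an outage.
for sample in (2, 2, 0):
    print(sample, "->", describe_status(sample))
```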

Failed over

Indicates whether this controller has failed over to the partner controller, i.e., the secondary controller (in a redundant setup).

The values that this measure can report and their corresponding numeric values are tabulated below:

Measure Value    Numeric Value
Yes              0
No               1

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the failed over status of this controller. However, the graph of this measure will be represented using the corresponding numeric equivalents of the Measure Values as mentioned in the table above.

The detailed diagnosis of this measure, if enabled, lists the time, the name of the controller, and the reason for the failover of the controller.
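
Note that, per the table above, Yes is graphed as 0 and No as 1, so a failover shows up as a drop in the graph. The sketch below locates such drops in a series of graphed values; `failover_points` is a hypothetical helper name and the sample series is made up for illustration.

```python
from typing import List, Sequence

FAILED_OVER_YES = 0  # numeric equivalents from the table above
FAILED_OVER_NO = 1


def failover_points(samples: Sequence[int]) -> List[int]:
    """Return the indexes at which the measure changes from No (1) to Yes (0)."""
    return [
        i
        for i in range(1, len(samples))
        if samples[i - 1] == FAILED_OVER_NO and samples[i] == FAILED_OVER_YES
    ]


print(failover_points([1, 1, 0, 0, 1, 0]))  # prints [2, 5]
```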

Health

Indicates the overall health of this controller.

The values that this measure can report and their corresponding numeric values are tabulated below:

Measure Value    Numeric Value
Fault            0
OK               1
Unknown          2

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the overall health of this controller. However, in the graph of this measure, controller health will be represented using the corresponding numeric equivalents only.

The detailed diagnosis of this measure, if enabled, lists the time, the name of the controller, and the reason for the reported health state of the controller.
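
Because the test reports one set of results per controller, the Health readings can be scanned to pick out the controllers that warrant a look at the detailed diagnosis. The sketch below assumes the numeric mapping of the Health measure given above; the helper name and the controller names `A` and `B` are hypothetical.

```python
from typing import Dict, List

# Numeric equivalents of the Health measure, as tabulated above.
HEALTH_VALUES = {0: "Fault", 1: "OK", 2: "Unknown"}


def controllers_needing_attention(health_by_controller: Dict[str, int]) -> List[str]:
    """Return the controllers whose Health measure is anything other than OK."""
    return sorted(
        name
        for name, value in health_by_controller.items()
        if HEALTH_VALUES.get(value) != "OK"
    )


print(controllers_needing_attention({"A": 1, "B": 0}))  # prints ['B']
```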