XIO Storage Controllers Test

This test auto-discovers the storage controllers in the target EMC XtremIO storage array and reports whether or not each storage controller is enabled. Using this test, administrators can determine the current health of each storage controller and of the components within it, such as the fans, Field Replaceable Units, internal sensors, and management port. This test also helps administrators determine the journaling state and the RAM usage level of each storage controller. Temperature and voltage deviations on each storage controller can also be easily detected and rectified.

Target of the test : An EMC XtremIO Storage array

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each Storage Controller on the target EMC XtremIO being monitored

Configurable parameters for the test
Parameter Description

Test Period

How often the test should be executed.

Host

The IP address of the storage device for which this test is to be configured.

Port

The port number at which the storage array listens. The default is NULL.

XtremIO User and XtremIO Password

Provide the credentials of a user who has read-only privileges to access the XtremIO storage array in the XtremIO User and XtremIO Password text boxes.

Confirm Password

Confirm the password by retyping it here.

XMS IP

This parameter is applicable only for EMC XtremIO 4.x. By default, None will be chosen from this list. If the target EMC XtremIO storage array is managed by an XMS Management Server that is auto-discovered, then the IP or host name of that XMS Management Server will be displayed in this list. Select that particular XMS IP to configure this test. If you wish to monitor an EMC XtremIO storage array that is either not part of the auto-discovered XMS Management Server or is a brand new EMC XtremIO storage array, choose the Other option. This will enable you to add a new XMS Management Server. To know how to add a new XMS Management Server, refer to Adding a new XMS.

SSL

The eG agent collects performance metrics by invoking RESTful APIs on the target storage array. Typically, the RESTful APIs can be invoked over either HTTP or HTTPS. By default, the eG agent invokes the RESTful APIs over HTTPS, which is why the SSL flag is set to Yes by default. If the target storage array is not SSL-enabled, then the RESTful APIs can be accessed over HTTP only. In this case, set the SSL flag to No.
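The effect of the Host, Port, and SSL parameters on the URL scheme can be illustrated with a minimal Python sketch. The function name `build_base_url` is hypothetical, and treating a Port value of NULL as "use the scheme's default port" is an assumption made here for illustration; this is not the eG agent's actual implementation, and the API path is deliberately left out.

```python
from typing import Optional


def build_base_url(host: str, port: Optional[str], ssl: bool = True) -> str:
    """Build a base URL from the Host, Port, and SSL test parameters."""
    # SSL = Yes selects HTTPS; SSL = No selects HTTP.
    scheme = "https" if ssl else "http"
    # Assumption: None, an empty string, or the literal "NULL" means that
    # no explicit port is appended and the scheme's default port applies.
    if port is None or port.strip().upper() in ("", "NULL"):
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{int(port)}"


# Example addresses are placeholders, not values from a real deployment.
print(build_base_url("192.168.10.20", "NULL"))             # SSL flag set to Yes
print(build_base_url("192.168.10.20", "8080", ssl=False))  # SSL flag set to No
```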

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 2:1. This indicates that, by default, detailed measures will be generated every second time this test runs, and also every time the test detects a problem. You can modify this frequency if you so desire. To disable the detailed diagnosis capability for this test, specify none against DD Frequency.
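As a rough illustration of how a value such as 2:1 can be read, the Python sketch below parses the setting into a normal and an abnormal frequency and decides whether a given run would produce detailed measures. The function names and the 1-based `run_number` counter are hypothetical, and the modulo-based scheduling is a simplification assumed for this example rather than the agent's documented logic.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Parse 'normal:abnormal' (for example '2:1'); 'none' disables DD."""
    if value.strip().lower() == "none":
        return None
    normal, abnormal = value.split(":")
    return int(normal), int(abnormal)


def should_generate_dd(run_number: int, problem_detected: bool,
                       dd_frequency: str = "2:1") -> bool:
    """Decide whether detailed measures are generated on this run.

    run_number is a 1-based count of test executions (an assumption).
    """
    frequency = parse_dd_frequency(dd_frequency)
    if frequency is None:
        return False
    normal, abnormal = frequency
    # Use the abnormal frequency while a problem is detected,
    # and the normal frequency otherwise.
    every = abnormal if problem_detected else normal
    return every > 0 and run_number % every == 0


for run in range(1, 5):
    print(run, should_generate_dd(run, problem_detected=False))
```

With the default of 2:1, the loop reports detailed measures on runs 2 and 4 when no problem is detected, while any run that detects a problem generates them.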

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Neither the normal nor the abnormal frequency configured for the detailed diagnosis measures should be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Storage Controller’s current health state

Indicates the current health of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Level_1_clear 0
Level_2_unknown 1
Level_3_warning 2
Level_4_minor 3
Level_5_major 4
Level_6_critical 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current health of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 5.
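Because the graph shows only numbers, a lookup in both directions can help when reading it. The Python sketch below encodes the table above as a dictionary and derives the reverse mapping; the names `HEALTH_STATE_TO_NUMERIC` and `describe_graph_point` are hypothetical and are not part of eG Enterprise.

```python
# Measure values and numeric equivalents, copied from the table above.
HEALTH_STATE_TO_NUMERIC = {
    "Level_1_clear": 0,
    "Level_2_unknown": 1,
    "Level_3_warning": 2,
    "Level_4_minor": 3,
    "Level_5_major": 4,
    "Level_6_critical": 5,
}

# Reverse mapping, used to translate a point on the graph back to its label.
NUMERIC_TO_HEALTH_STATE = {
    number: state for state, number in HEALTH_STATE_TO_NUMERIC.items()
}


def describe_graph_point(numeric_value: int) -> str:
    """Return the measure value that a graphed number stands for."""
    return NUMERIC_TO_HEALTH_STATE.get(numeric_value, "unrecognized value")


print(describe_graph_point(2))
print(HEALTH_STATE_TO_NUMERIC["Level_5_major"])
```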

Is enabled?

Indicates whether or not this Storage Controller is enabled.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Yes 0
User_disabled 1
System_disabled 2

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether or not the Storage Controller is enabled. The graph of this measure, however, is represented using the numeric equivalents only - 0 to 2.
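Note that 0 means enabled for this measure, while the two non-zero values distinguish who disabled the controller. A small Python sketch, assuming the numeric values have already been collected per controller, shows one way to pick out the disabled ones. The `EnabledState` enum, the `disabled_controllers` helper, and the controller names are hypothetical.

```python
from enum import IntEnum
from typing import Dict


class EnabledState(IntEnum):
    """Numeric equivalents of the 'Is enabled?' measure, per the table above."""
    YES = 0
    USER_DISABLED = 1
    SYSTEM_DISABLED = 2


def disabled_controllers(states: Dict[str, int]) -> Dict[str, str]:
    """Return each controller that is not enabled, with the reason."""
    return {
        name: EnabledState(value).name
        for name, value in states.items()
        if value != EnabledState.YES
    }


# Hypothetical sample readings keyed by controller name.
sample = {"X1-SC1": 0, "X1-SC2": 2}
print(disabled_controllers(sample))
```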

Fan health state

Indicates the current health of the fan sensor types (both analog and discrete sensors) of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Level_1_clear 0
Level_2_unknown 1
Level_3_warning 2
Level_4_minor 3
Level_5_major 4
Level_6_critical 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current health of the fan sensor types of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 5.

Health status

Indicates the current health of the Field Replaceable Units of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Healthy 0
Initializing 1
Uninitialized 2
Failed 3
Disconnected 4

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current health of the Field Replaceable Units of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 4.

Internal sensor health state

Indicates the current health state of the temperature/fan/voltage/current/internal sensor types (both analog and discrete sensors) of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Level_1_clear 0
Level_2_unknown 1
Level_3_warning 2
Level_4_minor 3
Level_5_major 4
Level_6_critical 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current health of all the sensor types of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 5.

iSCSI daemon state

Indicates the state of the iSCSI daemon of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Healthy 0
Failed 1

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the state of the iSCSI daemon of this Storage Controller. The graph of this measure, however, is represented using the numeric equivalents only - 0 or 1.

Journal state

Indicates the health of the journals during failover and failback on this Storage Controller.

 

Every piece of information that is not yet committed to the SSD is kept in multiple locations, called Journals. Each software module has its own Journal, which is not kept on the same Storage Controller and can be used to restore data in case of an unexpected failure. Journals are regarded as highly important and are always kept on Storage Controllers with battery-backed power supplies. If there is a problem with the Battery Backup Unit, the Journal fails over to another Storage Controller. In case of a global power failure, the Battery Backup Units ensure that all Journals are written to vault drives in the Storage Controllers before the system is turned off.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Healthy 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the journal health state regarding failover and failback of this Storage Controller. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0.

RAM usage level

Indicates the health of the RAM low level indicator of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Healthy 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the health of the RAM low level indicator of this Storage Controller. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0.

Management link health state

Indicates the current health of the management port of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Level_1_clear 0
Level_2_unknown 1
Level_3_warning 2
Level_4_minor 3
Level_5_major 4
Level_6_critical 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current health of the management port of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 5.

Remote journaling health state

Indicates the health of the remote journal of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Healthy 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the health of the remote journal of this Storage Controller. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0.

Temperature health state

Indicates the health of the temperature sensor types (both analog and discrete sensors) of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Level_1_clear 0
Level_2_unknown 1
Level_3_warning 2
Level_4_minor 3
Level_5_major 4
Level_6_critical 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the health of the temperature sensor types of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 5.

Voltage health state

Indicates the health of the voltage sensor types (both analog and discrete sensors) of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Level_1_clear 0
Level_2_unknown 1
Level_3_warning 2
Level_4_minor 3
Level_5_major 4
Level_6_critical 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the health of the voltage sensor types of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 to 5.

Management port state

Indicates the current state of the management port of this Storage Controller.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure value Numeric Value
Up 0
Down 1

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current state of the management port of this Storage Controller. The graph of this measure however is represented using the numeric equivalents only - 0 or 1.
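Several measures of this test (current health, fan, internal sensor, management link, temperature, and voltage) share the same Level_1_clear to Level_6_critical scale, and in every table above the numeric equivalent is the level number minus one. The Python sketch below uses that pattern to find the measure with the highest numeric value for one storage controller. The function names and sample readings are hypothetical, and treating a higher number as a worse state is an assumption (it ranks Level_2_unknown below Level_3_warning, which may not suit every environment).

```python
import re
from typing import Dict, Tuple

# Matches labels such as "Level_3_warning" and captures the level digit.
_LEVEL_PATTERN = re.compile(r"^Level_(\d)_\w+$")


def level_to_numeric(value: str) -> int:
    """Convert a Level_N_* measure value to its numeric equivalent (N - 1)."""
    match = _LEVEL_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a Level_N_* measure value: {value!r}")
    return int(match.group(1)) - 1


def worst_measure(readings: Dict[str, str]) -> Tuple[str, int]:
    """Return the measure with the highest numeric value and that value."""
    name = max(readings, key=lambda measure: level_to_numeric(readings[measure]))
    return name, level_to_numeric(readings[name])


# Hypothetical readings for a single storage controller.
readings = {
    "Fan health state": "Level_1_clear",
    "Temperature health state": "Level_5_major",
    "Voltage health state": "Level_3_warning",
}
print(worst_measure(readings))
```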