EMC ECS VDC Nodes Test

Nodes are the key component of EMC ECS. They are the servers on which the EMC ECS processes that provide storage services run. This test assesses the overall health of a VDC by examining the state of each node in it. For a VDC to function effectively, the majority of its nodes must be working optimally, bad nodes must be replaced promptly, and offline nodes must be given maintenance. The test collects node health metrics at the VDC level; if administrators find the overall node health deteriorating, they can drill down to the node level using the metrics reported by the node test.

Target of the test : A Dell EMC Elastic Cloud Storage System

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each Virtual Data Centre (VDC)

Configurable parameters for the test
Parameter    Description

Test period

How often should the test be executed.

Host

The host for which the test is to be configured. Since the storage device is managed using the IP address of its storage controller, the same will be displayed as host.

Port

The port number at which the specified host listens. By default, this is NULL.

ECS REST API Port

This is the port at which REST API connectivity is provided. By default, port 4443 is used.

Username and Password

To collect performance metrics from the target storage device, the eG agent should be configured with the credentials of a user who is vested with "read-only" privileges to access the REST API of the target storage device. Specify the credentials of such a user in the Username and Password text boxes.

Confirm Password

Confirm the password by retyping it here.

Timeout Seconds

In the Timeout Seconds text box, specify the duration (in seconds) for which this test should wait for a response from the storage system. By default, this is 60 seconds.
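As an illustration of how the Host, ECS REST API Port, Username and Password parameters fit together, the following minimal sketch builds (but does not send) an HTTPS login request using basic authentication. It is not the eG agent's actual implementation. The function name build_login_request and the "/login" path are assumptions made for this example; verify the path against the ECS REST API reference for your release.

```python
import base64
import urllib.request


def build_login_request(host, rest_port=4443, username="", password=""):
    """Build (but do not send) a basic-auth login request.

    The "/login" path is an assumption for illustration only.
    """
    url = f"https://{host}:{rest_port}/login"
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Sending the request would look like the commented line below, with the
# timeout mirroring the Timeout Seconds parameter (60 seconds by default):
# response = urllib.request.urlopen(request, timeout=60)
```

A read-only account is sufficient here, since the request only authenticates and retrieves data.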

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency if you so desire. If you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD Frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
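The DD Frequency value and the two conditions above can be sketched as a small parser. This is only an illustration: the function names are hypothetical, the value is assumed to have the form normal:abnormal (as in the default 1:1), and "should not be 0" is interpreted here as neither frequency being 0.

```python
def parse_dd_frequency(value):
    """Parse 'normal:abnormal' into a pair of ints, or None for 'none'."""
    text = value.strip().lower()
    if text == "none":
        return None  # detailed diagnosis disabled for the test
    normal_text, abnormal_text = text.split(":")
    normal, abnormal = int(normal_text), int(abnormal_text)
    if normal < 0 or abnormal < 0:
        raise ValueError("frequencies must be non-negative")
    return normal, abnormal


def dd_toggle_available(frequency, license_allows_dd):
    """Mirror the two conditions for the On/Off option (assumed reading)."""
    if frequency is None or not license_allows_dd:
        return False
    normal, abnormal = frequency
    return normal != 0 and abnormal != 0
```

For example, parse_dd_frequency("1:1") yields the pair (1, 1), and dd_toggle_available((0, 1), True) is False because the normal frequency is 0.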
Measurements made by the test
Measurement    Description    Measurement Unit    Interpretation

Is inactive state?

Indicates whether the inactive state is enabled for nodes in the VDC.

The values that this measure can report and their corresponding numeric values are tabulated below:

Measure Value    Numeric Value
Yes              1
No               0

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the status. However, in the graph of this measure, the same will be represented using the corresponding numeric equivalents only.

The detailed diagnosis of this measure provides additional details, including VDC ID, VDC end points, and Is permanently failed.

Number of nodes

Indicates the total number of nodes in this VDC.

Number

A majority of nodes should be online for a VDC to function as expected. These measures are good indicators of performance degradation caused by node unavailability.

Good nodes

Indicates the total number of good nodes in this VDC.

Number

Bad nodes

Indicates the total number of bad nodes in this VDC.

Number

Maintenance nodes

Indicates the number of nodes that are under maintenance.

Number
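The node-count measures above can be related to one another with a short sketch. It assumes each node is labeled with one of three hypothetical state strings ("good", "bad", "maintenance") and treats "majority" as strictly more than half of all nodes; neither the labels nor the threshold come from the product itself.

```python
from collections import Counter


def summarize_nodes(node_states):
    """Count nodes per state and check whether good nodes form a majority.

    The state labels 'good', 'bad' and 'maintenance' are assumptions.
    """
    counts = Counter(node_states)
    total = len(node_states)
    good = counts.get("good", 0)
    return {
        "total": total,
        "good": good,
        "bad": counts.get("bad", 0),
        "maintenance": counts.get("maintenance", 0),
        # Strict majority: more than half of all nodes are good.
        "majority_good": good * 2 > total,
    }
```

A falling good count alongside a rising bad or maintenance count is the pattern that warrants drilling down to the node level.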

Average CPU usage

Indicates the average percentage of CPU usage across all online nodes in this VDC.

Percentage

If the average CPU usage for the VDC is near 100, the VDC is overloaded: most of its nodes are either running at peak capacity or have failed. If high usage continues for a long period, the entire VDC may fail. If this value is high, either add more nodes to the system or divert traffic to other VDCs, if available.
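To distinguish a brief spike from the sustained high usage described above, successive values of this measure can be checked against a threshold. The sketch below is illustrative only: the function names, the 90 percent threshold and the three-sample window are assumptions, not product defaults.

```python
def average_cpu(per_node_cpu):
    """Mean CPU percentage across online nodes; None if no node is online."""
    if not per_node_cpu:
        return None
    return sum(per_node_cpu) / len(per_node_cpu)


def sustained_high(samples, threshold=90.0, min_consecutive=3):
    """True if the latest min_consecutive samples all reach the threshold."""
    if len(samples) < min_consecutive:
        return False
    return all(value >= threshold for value in samples[-min_consecutive:])
```

Here samples would be the values of Average CPU usage from consecutive test runs, oldest first.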

Average relative memory

Indicates the average percentage of memory usage against memory available across all nodes in this VDC.

Percentage

If the average memory usage is high, processes may be starved of memory, which can lead to delayed responses from nodes and overall performance degradation of the system. If the value stays high for long periods, consider adding more nodes or adding more memory to the existing nodes.

Average memory usage

Indicates the average aggregate memory usage per node available on this VDC.

Percentage
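One way to read the relative memory measure is as the mean, over all nodes, of memory used divided by memory available, expressed as a percentage. The sketch below follows that reading. It is an assumption about how the figure could be derived, with "memory available" taken to mean each node's total memory; the function name is hypothetical.

```python
def average_relative_memory(nodes):
    """Mean of used/total memory as a percentage.

    nodes is a list of (used, total) pairs in the same unit; nodes that
    report a non-positive total are skipped.
    """
    ratios = [used / total * 100.0 for used, total in nodes if total > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios)
```

For instance, two nodes using 8 and 12 units out of 16 each average to 62.5 percent.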

Average NIC transmitted bandwidth

Indicates the average rate of data transmitted through NICs across the VDC.

MB/Sec

NIC bandwidth should be tracked over a period of time and compared both with previous values and with the manufacturer-provided range.

Average NIC received bandwidth

Indicates the average rate of data received through NICs across the VDC.

MB/Sec

Average NIC bandwidth

Indicates the average bandwidth of NIC hardware used across the VDC.

MB/Sec

Average NIC transmitted utilization

Indicates the average percentage of bandwidth utilized by NICs out of the available bandwidth across the VDC, for data transmission.

Percentage

A persistent value near 100 may indicate that the NICs are overloaded. System load should be reduced either by adding more nodes or by diverting traffic to other VDCs, if available.

Average NIC received utilization

Indicates the average percentage of bandwidth utilized by NICs out of the available bandwidth across the VDC, for data reception.

Percentage

Average NIC utilization

Indicates the average percentage of bandwidth utilized against the available NIC bandwidth across the VDC.

Percentage
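The NIC utilization measures relate the transmitted and received rates to the NIC bandwidth. A minimal sketch of that relationship follows, assuming that utilization is simply the rate divided by the bandwidth, both in MB/Sec. The function name is hypothetical, and how the product combines the transmit and receive figures into the overall utilization is not stated here, so it is left out.

```python
def nic_utilization(rate_mb_per_sec, bandwidth_mb_per_sec):
    """Utilization percentage = rate / bandwidth * 100 (assumed formula)."""
    if bandwidth_mb_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return rate_mb_per_sec / bandwidth_mb_per_sec * 100.0
```

Called with the Average NIC transmitted bandwidth and the Average NIC bandwidth, this gives a figure comparable to Average NIC transmitted utilization; the received side works the same way.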