EMC ECS Replication Groups Test

The replication group is used by ECS for replicating data to other sites so that the data is protected and can be accessed from other, active sites. When you create a bucket, you specify the replication group it is in. ECS ensures that the bucket and the objects in the bucket are replicated to all the sites in the replication group.

Because data is the most important asset of any organization, it must be protected against the failure of a disk, a node, or an entire site.

The purpose of this test is to determine the status of each replication group, the number of zones to which its data is replicated, the rate at which data is transferred, and the amount of data still to be sent. Any deviation from the norm could indicate a problem that administrators need to correct.

Target of the test : A Dell EMC Elastic Cloud Storage System

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each replication group in EMC Elastic Cloud Storage

Configurable parameters for the test
Parameter Description

Test period

How often should the test be executed.

Host

The host for which the test is to be configured. Since the storage device is managed using the IP address of its storage controller, the same will be displayed as host.

Port

The port number at which the specified host listens. By default, this is NULL.

ECS REST API Port

This is the port at which REST API connectivity is provided. By default, port 4443 is used.

Username and Password

To collect performance metrics from the target storage device, the eG agent should be configured with the credentials of a user who has "read-only" privileges to access the REST API of the target storage device. Specify the credentials of such a user in the Username and Password text boxes.

Confirm Password

Confirm the password by retyping it here.

Timeout Seconds

In the Timeout Seconds text box, specify how long this test should wait for a response from the storage system. By default, this is 60 seconds.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Neither the normal nor the abnormal frequency configured for the detailed diagnosis measures should be 0.
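As a rough illustration of how the Host, ECS REST API Port, Username, Password, and DD Frequency parameters described above fit together, the Python sketch below builds (but does not send) an authenticated request and parses a DD frequency string. It is not the eG agent's implementation. The function names build_login_request and parse_dd_frequency are hypothetical, and the /login path with HTTP Basic authentication is an assumption about the ECS management REST API that should be checked against the ECS documentation for your release.

```python
import base64
import urllib.request


def build_login_request(host, username, password, rest_port=4443):
    """Build (without sending) a Basic-auth GET request.

    Assumption: the ECS management API exposes a /login endpoint on the
    REST API port (4443 by default, as stated in the parameter list).
    """
    url = f"https://{host}:{rest_port}/login"
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}, method="GET"
    )


def parse_dd_frequency(value):
    """Parse a DD Frequency setting such as "1:1".

    Returns a (normal, abnormal) tuple of integers, or None when the
    setting is "none", which disables detailed diagnosis.
    """
    if value.strip().lower() == "none":
        return None
    normal, abnormal = (int(part) for part in value.split(":"))
    return normal, abnormal
```

A request built this way could be sent with urllib.request.urlopen(req, timeout=60), where the timeout mirrors the default Timeout Seconds value. A storage system that presents a self-signed certificate would additionally need a suitable SSL context.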
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Is inactive state?

Indicates if the replication group is in the inactive state.

 

A health status of 0 indicates that the node is offline. Usually, this means that the node has either failed or been taken offline by a user. If the node has failed, you may want to start the recovery process.

The values that this measure can report and their corresponding numeric values are tabulated below:

Measure Value Numeric Value
No 0
Yes 1

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the status. However, in the graph of this measure, the same will be represented using the corresponding numeric equivalents only.

The detailed diagnosis of this measure provides additional information, including the Replication Group Id, the description, and whether IS Allow Namespace is true.
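The Measure Value to Numeric Value mapping tabulated for the Is inactive state? measure can be expressed as a small lookup, which may help when post-processing graph data that contains only the numeric equivalents. This is a minimal Python sketch. The names INACTIVE_STATE_TO_NUMERIC, to_numeric, and to_measure_value are hypothetical and are not part of eG Enterprise.

```python
# Mapping taken from the Measure Value / Numeric Value table for this measure.
INACTIVE_STATE_TO_NUMERIC = {"No": 0, "Yes": 1}

# Reverse lookup, for turning graph values back into Measure Values.
NUMERIC_TO_INACTIVE_STATE = {
    number: label for label, number in INACTIVE_STATE_TO_NUMERIC.items()
}


def to_numeric(measure_value):
    """Return the numeric equivalent of a Measure Value ("No" or "Yes")."""
    return INACTIVE_STATE_TO_NUMERIC[measure_value]


def to_measure_value(numeric_value):
    """Return the Measure Value for a numeric value read from a graph (0 or 1)."""
    return NUMERIC_TO_INACTIVE_STATE[numeric_value]
```

Any value outside the table raises a KeyError, so an unexpected state is surfaced rather than silently mapped.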

Number of Zones

Indicates the number of zones or physical geographical sites the given replication group spans.

Number

 

Replication data sent

Indicates the rate at which replication data is sent across to other storage pools within the same replication group.

MB/Sec

The rate at which data is sent and received should be optimal. A drop in the rate may be caused by a network issue or by a problem with node processes. Any persistent drop in the rate should be investigated.

 

Replication data received

Indicates the rate at which replication data is received from other storage pools within the same replication group.

MB/Sec

Pending user data

Indicates the amount of user data that is pending replication.

GB

 

Pending System metadata

Indicates the amount of system metadata that is pending replication.

GB

 

Pending XOR data

Indicates the amount of XOR-encoded data that is pending replication.