Redis Cluster Failover Test

High availability in Redis is achieved through master-replica replication.

A master Redis server can have multiple Redis servers as replicas, preferably deployed on different nodes across multiple data centers. When the master is unavailable, failover occurs and one of the replicas can be promoted to become the new master and continue to serve data with little or no interruption.

However, in a cluster with a large number of nodes, it is laborious for administrators to find out whether or not a failover happened and what the current role of each node is. The Redis Cluster Failover Test helps administrators with this. This test continuously monitors all the nodes in the cluster and promptly alerts administrators if a failover occurred. In addition, administrators can use the detailed diagnosis to identify which nodes in the cluster changed roles.
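The idea of detecting a failover by comparing node roles over time can be sketched as follows. This is only an illustration, not the test's actual implementation: it assumes two snapshots of `CLUSTER NODES` output taken at different times, and the function names `parse_roles` and `detect_failover`, the shortened node IDs, and the sample addresses are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshots of CLUSTER NODES output (node IDs shortened for readability).
BEFORE = """\
aaa1 172.16.8.81:30071@40071 myself,master - 0 0 1 connected 0-16383
bbb2 172.16.8.81:30072@40072 slave aaa1 0 0 1 connected
"""
AFTER = """\
aaa1 172.16.8.81:30071@40071 master,fail - 0 0 1 disconnected
bbb2 172.16.8.81:30072@40072 myself,master - 0 0 2 connected 0-16383
"""


def parse_roles(cluster_nodes_output: str) -> Dict[str, str]:
    """Map each node address (ip:port) to 'master' or 'replica'."""
    roles: Dict[str, str] = {}
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        # The second field is "ip:port@cport"; keep only "ip:port".
        address = fields[1].split("@")[0]
        # The third field is a comma-separated flag list, e.g. "myself,master".
        flags = fields[2].split(",")
        if "master" in flags:
            roles[address] = "master"
        elif "slave" in flags:
            roles[address] = "replica"
    return roles


def detect_failover(previous: Dict[str, str],
                    current: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (address, old_role, new_role) for every node whose role changed."""
    changes = []
    for address, new_role in sorted(current.items()):
        old_role = previous.get(address)
        if old_role is not None and old_role != new_role:
            changes.append((address, old_role, new_role))
    return changes
```

With the sample snapshots, `detect_failover(parse_roles(BEFORE), parse_roles(AFTER))` reports that the node at 172.16.8.81:30072 moved from replica to master. A non-empty result corresponds to a failover having happened, and the list itself resembles the role-change details a detailed diagnosis would show.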

Target of the test : A Redis Cluster

Agent deploying the test : An external agent

Outputs of the test : One set of results for the target Redis cluster being monitored.

Configurable parameters for the test
Parameters Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port at which the specified HOST listens.

Username

This parameter is applicable only if the target server is Redis v6 or above. The eG agent has to be configured with the credentials of a user who has read-only privileges on the monitored Redis Cluster server. To create a user, run a command of the following form:

acl setuser <username> on ><password> allcommands allkeys

For example, to create a user named eguser with read-only privileges, run the following command:

acl setuser eguser allkeys -@all +client|list +cluster|info +cluster|nodes +config|get +info +memory|usage +ping +scan +slowlog|get +time +ttl +xinfo|groups +xinfo|stream resetchannels on >password

Redis Password and Confirm Password

If the target server is Redis v6 or above, then specify the password that corresponds to the above-mentioned Username in this text box.

In some high-security environments, a password may have been set for the Redis server (before v6) to protect it from unauthorized access or abuse. If such a password has been set for the monitored Redis server, then specify that password against REDIS PASSWORD. Then, confirm the password by retyping it against CONFIRM PASSWORD.

If the Redis server is not password protected, then do not disturb the default setting of this parameter.

To determine whether or not the target Redis server is password-protected, do the following:

  • Log in to the system hosting the Redis server.

  • Open the redis.conf file in the <REDIS_INSTALL_DIR>.

  • Look for the requirepass parameter in the file.

  • If this parameter exists, and is not preceded by a # (hash) symbol, it means that password protection is enabled for the Redis server. In this case, the string that follows the requirepass parameter is the password of the Redis server. For instance, say that the requirepass specification reads as follows:

    requirepass red1spr0

    According to this specification, the Redis server is protected using the password red1spr0. In this case, therefore, you need to specify red1spr0 against REDIS PASSWORD.

  • On the other hand, if the requirepass parameter is prefixed by the # (hash) symbol as shown below, it means password protection is disabled.

    # requirepass red1spr0

    In this case, leave the REDIS PASSWORD parameter with its default setting.
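The manual check described above can also be scripted. The following is a minimal sketch, assuming the contents of redis.conf have already been read into a string; the function name `find_requirepass` is hypothetical, and the sketch assumes that when the directive appears more than once, the last active occurrence is the one in effect.

```python
from typing import Optional


def find_requirepass(conf_text: str) -> Optional[str]:
    """Return the active requirepass value, or None if it is absent or commented out."""
    password = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a directive disabled with a leading '#'
        parts = line.split(None, 1)
        if parts[0].lower() == "requirepass" and len(parts) == 2:
            # Drop optional surrounding double quotes around the value.
            password = parts[1].strip().strip('"')
    return password
```

For the two cases discussed above, `find_requirepass("requirepass red1spr0")` returns `"red1spr0"`, while `find_requirepass("# requirepass red1spr0")` returns `None`, meaning the REDIS PASSWORD parameter can be left at its default setting.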

SSL

By default, the SSL flag is set to No, indicating that the target Redis cluster server is not SSL-enabled. To enable the test to connect to an SSL-enabled Redis cluster server, set the SSL flag to Yes.

Cluster Nodes

By default, the Cluster Nodes parameter is set to auto-discover, indicating that this test will auto-discover the nodes available in the cluster and report metrics for all the discovered nodes. However, if a node fails, the eG agent needs to connect to any of the remaining available nodes to collect metrics. To enable this, provide a comma-separated list of nodes in this text box, in the following format: HOSTNAME1#PORT1,HOSTNAME2#PORT2,... For example: 172.16.8.81#30071,172.16.8.81#30072,...
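To make the HOSTNAME#PORT format concrete, here is a minimal sketch of how such a list could be validated and split into host and port pairs. The function name `parse_cluster_nodes` is hypothetical, and the sketch handles only an explicit node list, not the auto-discover default.

```python
from typing import List, Tuple


def parse_cluster_nodes(value: str) -> List[Tuple[str, int]]:
    """Split 'HOST1#PORT1,HOST2#PORT2' into a list of (host, port) tuples."""
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma or stray whitespace
        host, sep, port = entry.rpartition("#")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected HOSTNAME#PORT, got {entry!r}")
        nodes.append((host, int(port)))
    return nodes
```

For instance, `parse_cluster_nodes("172.16.8.81#30071,172.16.8.81#30072")` yields the two nodes from the example above as `("172.16.8.81", 30071)` and `("172.16.8.81", 30072)`, and an entry written with a colon instead of `#` raises a `ValueError`.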

Master Slave Group

In clusters with hundreds of nodes, administrators often find it difficult to identify the problematic nodes at a glance. To make this easier, nodes can be grouped, each master with its respective replicas, under a group name (prefix). To this effect, the Master Slave Group flag is set to Yes by default, indicating that this test will report metrics for each group name (prefix):node name combination; that is, the descriptor of this test will be group name (prefix):node name. If you do not wish to group the nodes, set this flag to No, in which case the descriptors of the test will be the individual nodes.

Master Slave Group Prefix

The Master Slave Group Prefix parameter is applicable only if the Master Slave Group flag is set to Yes. In this text box, specify the prefix under which the nodes should be grouped. For example, if the nodes are to be grouped with the prefix Shard, then the descriptors of this test will be displayed in the following format: Shard1:<comma-separated list of node names>, Shard2:<comma-separated list of node names>, ...
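The descriptor format can be illustrated with a short sketch. The function name `build_descriptors` and the node names are hypothetical; the sketch also assumes that groups are numbered from 1 in the order given and that the master is listed before its replicas, neither of which is confirmed by this document.

```python
from typing import Dict, List


def build_descriptors(groups: Dict[str, List[str]], prefix: str = "Shard") -> List[str]:
    """Build '<prefix><n>:<master>,<replica>,...' descriptors, one per master."""
    descriptors = []
    # Assumption: numbering starts at 1 and follows the order of the mapping.
    for index, (master, replicas) in enumerate(groups.items(), start=1):
        names = ",".join([master] + replicas)
        descriptors.append(f"{prefix}{index}:{names}")
    return descriptors
```

Given a mapping such as `{"node1": ["node2"], "node3": ["node4", "node5"]}`, the sketch produces `Shard1:node1,node2` and `Shard2:node3,node4,node5`.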

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
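As a rough illustration of the value format, the sketch below splits a DD Frequency setting into its two parts. It assumes the value has the form normal:abnormal, where the first number applies to regular runs and the second to runs in which a problem is detected; the function name `parse_dd_frequency` is hypothetical.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Return (normal, abnormal) frequencies, or None if detailed diagnosis is disabled."""
    text = value.strip().lower()
    if text == "none":
        return None  # detailed diagnosis is switched off for this test
    normal, sep, abnormal = text.partition(":")
    if not sep or not normal.isdigit() or not abnormal.isdigit():
        raise ValueError(f"expected 'N:M' or 'none', got {value!r}")
    return int(normal), int(abnormal)
```

Under these assumptions, the default `1:1` parses to `(1, 1)`, and `none` parses to `None`.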

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Is failover happened?

Indicates whether or not a failover happened on this cluster.

 

The numeric values that correspond to these measure values are discussed in the table below:

Measure Value    Numeric Value
Yes              1
No               0

Note:

This measure reports the Measure Values listed in the table above to indicate whether or not a failover happened on the cluster. However, in the graph, this measure is indicated using the Numeric Values listed in the table above.

If the value reported by this measure is Yes, use the detailed diagnosis to get more details on the role status of the nodes.