RabbitMQ Cluster Status Test

At least one node must be up and running at any given time to keep a RabbitMQ cluster alive; if all of its nodes are down, client applications cannot access the cluster. To ensure high cluster availability, administrators should therefore keep an eye on the running state of every node in the cluster and promptly identify the nodes that are not running, so that they can quickly start those nodes. This is where the RabbitMQ Cluster Status test helps!

This test monitors the status of the nodes in a cluster and reports the count of nodes that are running and those that are down. The test also promptly notifies administrators if even a single node becomes unavailable. The detailed diagnostics provided by this test reveal the names of the unavailable nodes, enabling administrators to start those nodes and restore cluster availability.

Target of the test : A RabbitMQ Cluster

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the cluster being monitored

Configurable parameters for the test
Parameters Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port at which the configured Host listens; by default, this is 15672.

Username, Password, and Confirm Password

The eG agent connects to the Management Interface of the rabbitmq-management plugin on the target node, and uses the plugin to run HTTP-based API commands on the node and pull the metrics of interest. To connect to the plugin and run the API commands, the eG agent requires the privileges of a user on the cluster who has been assigned the 'monitoring' tag. If such a user already exists, configure this test with the Username and Password of that user. If no such user exists, create one for this purpose using the Management Interface. The steps for this are detailed in How Does eG Enterprise Monitor a RabbitMQ Cluster? In this case, make sure you configure this test with the Username and Password of the new user. Finally, confirm the password by retyping it in the Confirm Password text box.

SSL

By default, this flag is set to No, as the target node is not SSL-enabled by default. If the node is SSL-enabled, then set this flag to Yes.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Neither the normal nor the abnormal frequency configured for the detailed diagnosis measures should be 0.
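
As a rough illustration of how the Host, Port, Username, Password, and SSL parameters come together, the Python sketch below builds an authenticated request against the HTTP API of the rabbitmq-management plugin. The /api/nodes endpoint belongs to that API and lists the nodes of the cluster, but whether the eG agent calls this exact endpoint is an assumption. The function names build_nodes_request and fetch_nodes are hypothetical, chosen only for this example.

```python
import base64
import json
import urllib.request


def build_nodes_request(host, port, username, password, ssl=False):
    """Build a GET request for the management API's /api/nodes endpoint.

    Mirrors the test parameters: Host, Port, Username, Password, and SSL.
    Using /api/nodes here is an assumption about what the agent queries.
    """
    # The SSL flag only decides the URL scheme in this sketch.
    scheme = "https" if ssl else "http"
    url = f"{scheme}://{host}:{port}/api/nodes"

    # The management API accepts HTTP Basic authentication.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_nodes(request, timeout=10):
    """Send the request and return the decoded JSON body (a list of nodes)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Passing the result of build_nodes_request to fetch_nodes would return the node list, provided the user supplied has the 'monitoring' tag and the management plugin is reachable on the configured port.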
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Total nodes

Indicates the total number of nodes in the cluster.

Number

 

Running nodes

Indicates the number of nodes in the cluster that are currently running.

Number

Ideally, the value of this measure should be equal to that of the Total nodes measure.

Stopped nodes

Indicates the number of nodes in the cluster that are currently not running.

Number

Ideally, the value of this measure should be 0. If this measure reports a non-zero value, use the detailed diagnosis of the measure to identify which nodes in the cluster are not running.
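
The three measures above can be derived from a node list in a few lines. The Python sketch below assumes each node is a dictionary with a name and a boolean running field, which is the shape returned by the /api/nodes endpoint of the rabbitmq-management HTTP API. The function name summarize_nodes and the sample node names are hypothetical, and this is not necessarily how the eG agent computes the values.

```python
def summarize_nodes(nodes):
    """Compute Total nodes, Running nodes, and Stopped nodes from a node list.

    Each entry is assumed to carry a "name" and a boolean "running" field.
    A node with a missing or false "running" field is counted as stopped.
    """
    running = [node["name"] for node in nodes if node.get("running")]
    stopped = [node["name"] for node in nodes if not node.get("running")]
    return {
        "total_nodes": len(nodes),
        "running_nodes": len(running),
        "stopped_nodes": len(stopped),
        # Comparable to what the detailed diagnosis reveals: names of stopped nodes.
        "stopped_node_names": stopped,
    }


# Hypothetical sample data for a three-node cluster with one node down.
sample_nodes = [
    {"name": "rabbit@node1", "running": True},
    {"name": "rabbit@node2", "running": False},
    {"name": "rabbit@node3", "running": True},
]
summary = summarize_nodes(sample_nodes)
```

With this sample, Running nodes is lower than Total nodes and Stopped nodes is non-zero, which is the condition the test alerts on.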