Solace Cluster Health Test

If the primary or backup node becomes unavailable, the result can be prolonged outages and customer dissatisfaction. If the health of Redundancy, ConfigSync, or the Message spool of the cluster is degraded or down, configurations can go out of sync between the nodes, eventually causing replication issues and message loss. This in turn affects fault tolerance and overall service availability. The Solace Cluster Health test helps administrators detect such conditions early.

This test monitors the target Solace cluster and reports the availability of the nodes in the cluster. In addition, this test monitors the health of Redundancy, ConfigSync, and the Message spool of the cluster and reports whether that health is down or degraded. This way, administrators can easily identify faulty nodes, activity failover, or synchronization issues and rectify them before they seriously impact the service and user experience.

Target of the test : A Solace Cluster

Agent deploying the test : An external agent

Outputs of the test : One set of results for the target cluster that is to be monitored.

Configurable parameters for the test
Parameter Description

Test Period

How often should the test be executed.

Host

The IP address of the target host for which this test is to be configured.

Port

The port on which the Solace Cluster listens.

UserName, Password and Confirm Password

The eG agent uses the SEMP API to collect metrics from all the nodes in the Solace Cluster. To enable the eG agent to access the SEMP API and collect metrics, a user with read-only privileges must exist on every node in the cluster that requires monitoring. If no such user exists, create one manually with these privileges. For instructions, refer to Creating a New User for Monitoring Solace PubSub+ Event Broker.

Specify the credentials of such a user against the User Name and Password parameters. Confirm the Password by retyping it in the Confirm Password text box.

Total Cluster Nodes

In this text box, provide a comma-separated list of both the primary and backup nodes in the cluster that requires monitoring. Specify the nodes in the following format: HOSTNAME1:PORT1,HOSTNAME2:PORT2,... For example, 172.16.8.233#8080,172.16.8.235#8080,...
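The node list format described above can be checked before it is entered. The following is a minimal sketch, not part of the eG agent: the function name parse_cluster_nodes is hypothetical, and because the format string above shows ':' while the example shows '#', the sketch assumes either separator may appear between host and port.

```python
import re
from typing import List, Tuple

# Hypothetical helper, not an eG Enterprise API.
# Assumption: host and port may be separated by ':' or '#',
# since the text shows one in the format and the other in the example.
_NODE_PATTERN = re.compile(r"([^:#\s,]+)[:#](\d+)")


def parse_cluster_nodes(value: str) -> List[Tuple[str, int]]:
    """Split a comma-separated node list into (host, port) pairs."""
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma or stray whitespace
        match = _NODE_PATTERN.fullmatch(entry)
        if match is None:
            raise ValueError(f"Invalid node entry: {entry!r}")
        host, port = match.group(1), int(match.group(2))
        if not 0 < port < 65536:
            raise ValueError(f"Port out of range in entry: {entry!r}")
        nodes.append((host, port))
    return nodes


if __name__ == "__main__":
    print(parse_cluster_nodes("172.16.8.233#8080,172.16.8.235#8080"))
```

A check like this catches a malformed entry, such as a missing port, before the test is saved with a node list the agent cannot use.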

Primary Nodes

The eG agent needs to connect to the SEMP API on the primary node and run API commands to collect metrics. For this purpose, specify the details of the primary node in this text box, in the following format: HOSTNAME:PORT. For example, 172.16.8.233#8080.

SSL

By default, this flag is set to No, indicating that the Solace Cluster is not SSL-enabled. Set this flag to Yes if the Solace Cluster is SSL-enabled.
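To illustrate how the Primary Nodes, User Name, Password, and SSL parameters fit together, the sketch below builds (but does not send) an HTTP request for the primary node. It is an illustration only and does not describe how the eG agent is implemented. The function name build_semp_request is hypothetical, and the "/SEMP" path and the show-redundancy RPC body are assumptions based on Solace's legacy SEMP v1 interface; verify both against your broker version.

```python
import base64
import urllib.request


def build_semp_request(host, port, username, password, ssl_enabled=False,
                       path="/SEMP",
                       body="<rpc><show><redundancy/></show></rpc>"):
    """Build a POST request for the primary node without sending it.

    Assumptions (not confirmed by this document): the path and body
    defaults follow the legacy SEMP v1 style of Solace brokers.
    """
    # The SSL flag only decides which URL scheme is used.
    scheme = "https" if ssl_enabled else "http"
    url = f"{scheme}://{host}:{port}{path}"

    # HTTP Basic authentication with the read-only monitoring user.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request = urllib.request.Request(url, data=body.encode("utf-8"),
                                     method="POST")
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


if __name__ == "__main__":
    req = build_semp_request("172.16.8.233", 8080, "monitor", "secret")
    print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. Running such a request manually with the monitoring user's credentials is one way to confirm that the user, port, and SSL setting are correct before configuring the test.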

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Node availability

Indicates the availability of nodes in the target cluster.

The values reported by this measure and their numeric equivalents are listed in the table below:

Measure values                  Numeric values
Not available                   0
Redundant node not available    1
Available                       2
Unknown                         3

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the availability of nodes in the target cluster. The graph of this measure, however, represents them using only the numeric equivalents, i.e., 0 to 3.

Redundancy health

Indicates the current redundancy health of the target cluster.

The values reported by this measure and their numeric equivalents are listed in the table below:

Measure values    Numeric values
Down              0
Degraded          1
Failedover        2
Healthy           3
Unknown           4

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the redundancy health of the target cluster. The graph of this measure, however, represents them using only the numeric equivalents, i.e., 0 to 4.

Use the detailed diagnosis of this measure to find out the Primary redundancy health and Backup redundancy health.

Messagespool health

Indicates the current health of the Message spool in the target cluster.

The values reported by this measure and their numeric equivalents are listed in the table below:

Measure values    Numeric values
Down              0
Degraded          1
Failedover        2
Healthy           3
Unknown           4

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the Message spool health of the target cluster. The graph of this measure, however, represents them using only the numeric equivalents, i.e., 0 to 4.

Use the detailed diagnosis of this measure to identify the Primary Message spool status and Backup Message spool status.

ConfigSync health

Indicates the current health of ConfigSync in the target cluster.

The values reported by this measure and their numeric equivalents are listed in the table below:

Measure values    Numeric values
Down              0
Disabled          1
Degraded          2
Healthy           3
Unknown           4

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the ConfigSync health of the target cluster. The graph of this measure, however, represents them using only the numeric equivalents, i.e., 0 to 4.

Use the detailed diagnosis of this measure to identify the Primary configsync status and Backup configsync status.
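Because the graphs show only numeric equivalents and the four measures do not share one scale (for example, 2 means Available for Node availability, Failedover for Redundancy health, and Degraded for ConfigSync health), a lookup table helps when reading graph values or exported data. The sketch below is illustrative only. The names MEASURE_VALUES, NORMAL_STATE, decode, and find_problems are hypothetical, and treating Available and Healthy as the only normal states is an assumption of this sketch.

```python
from typing import Dict, List

# Numeric equivalents copied from the tables above.
MEASURE_VALUES: Dict[str, Dict[int, str]] = {
    "Node availability": {
        0: "Not available", 1: "Redundant node not available",
        2: "Available", 3: "Unknown",
    },
    "Redundancy health": {
        0: "Down", 1: "Degraded", 2: "Failedover", 3: "Healthy", 4: "Unknown",
    },
    "Messagespool health": {
        0: "Down", 1: "Degraded", 2: "Failedover", 3: "Healthy", 4: "Unknown",
    },
    "ConfigSync health": {
        0: "Down", 1: "Disabled", 2: "Degraded", 3: "Healthy", 4: "Unknown",
    },
}

# Assumption of this sketch: only these states count as normal.
NORMAL_STATE: Dict[str, str] = {
    "Node availability": "Available",
    "Redundancy health": "Healthy",
    "Messagespool health": "Healthy",
    "ConfigSync health": "Healthy",
}


def decode(measure: str, value: int) -> str:
    """Translate a graph value back into its Measure Value label."""
    return MEASURE_VALUES[measure].get(value, f"Unrecognized ({value})")


def find_problems(readings: Dict[str, int]) -> List[str]:
    """List every measure whose decoded state is not the normal one."""
    problems = []
    for measure, value in readings.items():
        label = decode(measure, value)
        if label != NORMAL_STATE[measure]:
            problems.append(f"{measure}: {label}")
    return problems


if __name__ == "__main__":
    print(find_problems({"Redundancy health": 2, "ConfigSync health": 2}))
```

Comparing against a named normal state, rather than a numeric threshold, avoids misreading Unknown as better than Healthy simply because its numeric equivalent is higher.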

The detailed diagnosis of the Redundancy health measure reveals further details, such as the Primary redundancy health and Backup redundancy health.

Figure 1 : Detailed diagnosis of Redundancy health measure

The detailed diagnosis of the Messagespool health measure reveals further details, such as the Primary Message spool status and Backup Message spool status.

Figure 2 : Detailed diagnosis of Messagespool health measure

The detailed diagnosis of the ConfigSync health measure reveals further details, such as the Primary configsync status and Backup configsync status.

Figure 3 : Detailed diagnosis of ConfigSync health measure