Solace Cluster Status Test
The HA Solace Cluster uses an Active/Standby model with an arbiter node (Monitoring Node) for split-brain detection. This requires three nodes each running the event broker:
- Primary node
- Backup node
- Monitoring node
The primary and backup nodes both run the software event broker under the messaging node role, while the monitoring node runs it under the monitoring node role. When in operation, the messaging nodes assume one of the Active/Standby roles: Primary or Backup. At any one time, one node is the primary and the other is the backup. Upon a failover, connections to the broker are switched over from the Primary to the Backup node automatically. The backup event broker then takes over the messaging activities until the primary event broker comes back online, at which point the primary event broker takes over the standby role.
However, if both the primary and backup nodes fail at the same time, the Solace cluster will not be able to operate. Such failures can be caused by issues in network, port, or SEMP connectivity, and they can eventually lead to message or data loss. To ensure that the nodes function properly, it is imperative to keep watch on the connectivity and activity state changes of the primary and backup nodes in the cluster. The Solace Cluster Status test helps administrators in this regard.
This test monitors the target Solace cluster and reports the number and percentage of nodes with network, SEMP, and port connectivity. The test also indicates whether any failover occurred among the nodes by reporting node activity state changes and the number of nodes in the local-active and mate-active states. In addition, the test helps you identify whether any node deviates from its designated state. This way, administrators can easily trace node failures or connectivity issues before they lead to catastrophic outcomes.
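The count and percentage measures described above follow from simple arithmetic over per-node check results. The sketch below is only an illustration of that arithmetic, not the eG agent's implementation; the function name `summarize_connectivity` and the input dictionary are hypothetical.

```python
from typing import Dict, Tuple


def summarize_connectivity(results: Dict[str, bool]) -> Tuple[int, float]:
    """Return (count, percent) of nodes whose connectivity check succeeded.

    `results` maps a node identifier to True (reachable) or False.
    This is a hypothetical helper for illustration only.
    """
    total = len(results)
    reachable = sum(1 for ok in results.values() if ok)
    # Avoid division by zero when no nodes are configured.
    percent = 100.0 * reachable / total if total else 0.0
    return reachable, percent


if __name__ == "__main__":
    # One of two nodes reachable: count is 1 and percentage is 50.
    checks = {"172.16.8.233": True, "172.16.8.235": False}
    print(summarize_connectivity(checks))  # (1, 50.0)
```

With two monitored nodes, a value of 100 means both checks passed and a value of 0 means neither did, which matches the interpretation given for the percentage measures.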
Target of the test : A Solace Cluster
Agent deploying the test : An external agent
Outputs of the test : One set of results for the target cluster that is to be monitored.
Parameter | Description |
---|---|
Test Period | How often the test should be executed. |
Host | The IP address of the target host for which this test is to be configured. |
Port | The port on which the Solace Cluster listens. |
UserName, Password and Confirm Password | The eG agent uses the SEMP API to collect metrics from all the nodes in the Solace Cluster. To enable the eG agent to access the SEMP API and collect metrics, a user with read-only privilege has to be created on all the nodes in the cluster that require monitoring. If such a user does not already exist, create one manually with these privileges; for instructions, refer to: Creating a New User for Monitoring Solace PubSub+ Event Broker. Specify the credentials of this user against the User Name and Password parameters. Confirm the Password by retyping it in the Confirm Password text box. |
Total Cluster Nodes | Provide a comma-separated list of both the primary and backup nodes in the cluster that require monitoring in this text box. Specify the nodes in the following format: HOSTNAME1:PORT1,HOSTNAME2:PORT2,... . For example, 172.16.8.233#8080,172.16.8.235#8080,.... |
Primary Nodes | The eG agent needs to connect to the SEMP API on the primary node and run API commands to collect metrics. For this purpose, configure the eG agent with the details of the primary node in this text box. Specify the node details in the following format: HOSTNAME:PORT. For example, 172.16.8.233#8080. |
Include Network Connectivity | By default, this flag is set to Yes, indicating that this test reports the network connectivity of the cluster. If you do not wish to monitor the network connectivity of the cluster through a ping check, set this flag to No. |
SSL | By default, this flag is set to No, indicating that the Solace Cluster is not SSL-enabled. Set this flag to Yes if the Solace Cluster is SSL-enabled. |
Detailed Diagnosis | To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: |
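To make the Total Cluster Nodes and Primary Nodes parameters above more concrete, the following sketch parses a comma-separated node list and probes each node's TCP port. It is an assumption-laden illustration, not the eG agent's logic: the helper names `parse_nodes` and `port_reachable` are hypothetical, and because the format description uses a colon while the examples use `#`, the parser accepts either separator rather than asserting which one the product expects.

```python
import re
import socket
from typing import List, Tuple


def parse_nodes(spec: str) -> List[Tuple[str, int]]:
    """Parse 'HOST:PORT,HOST:PORT' (or 'HOST#PORT') into (host, port) pairs.

    Hypothetical helper; accepting both ':' and '#' is an assumption.
    """
    nodes = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Host, then the last ':' or '#', then a numeric port.
        match = re.fullmatch(r"(.+)[:#](\d+)", entry)
        if match is None:
            raise ValueError(f"Malformed node entry: {entry!r}")
        nodes.append((match.group(1), int(match.group(2))))
    return nodes


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    for host, port in parse_nodes("172.16.8.233#8080,172.16.8.235#8080"):
        print(host, port)
```

A plain TCP connect only shows that the port accepts connections; it says nothing about whether the SEMP API behind it answers authenticated requests, which is why the test reports port and SEMP connectivity as separate measures.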
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Total nodes in the cluster | Indicates the total number of nodes in the target cluster. | Number | |
Nodes with network connectivity | Indicates the number of nodes with network connectivity in the target cluster. | Number | This measure is applicable only if the Include Network Connectivity parameter is set to Yes. |
Network connectivity to nodes | Indicates the percentage of nodes with network connectivity on the target cluster. | Percent | This measure is applicable only if the Include Network Connectivity parameter is set to Yes. This measure reports 100% if all the nodes in the cluster have network connectivity. The value 0 for this measure could mean that all nodes in the cluster are either down or too busy, or that the interconnecting network is down. The detailed diagnosis of this measure reports the Node details and the Status. |
Nodes with port connectivity | Indicates the number of nodes with TCP port connectivity on the target cluster. | Number | |
Port connectivity to nodes | Indicates the percentage of nodes with TCP port connectivity on the target cluster. | Percent | This measure reports 100% if all the nodes in the cluster have port connectivity. The detailed diagnosis of this measure reports the Node details and the Status. |
Nodes with SEMP connectivity | Indicates the number of nodes with SEMP connectivity on the target cluster. | Number | |
SEMP connectivity to nodes | Indicates the percentage of nodes with SEMP connectivity on the target cluster. | Percent | This measure reports 100% if all the nodes in the cluster have SEMP connectivity. The detailed diagnosis of this measure reports the Node details and the Status. |
Nodes in local-active state | Indicates the number of nodes in the local-active state on this cluster. | Number | In a redundancy setup, the primary node is in the local-active state most of the time. Primary specifies that the event broker is acting as the primary virtual router in an active/standby redundancy model. The detailed diagnosis of this measure reports the Node details. |
Nodes in local-active state | Indicates the percentage of nodes in the local-active state on the target cluster. | Percent | This measure reports 50% if one out of two nodes is in the local-active state. |
Nodes in mate-active state | Indicates the number of nodes in the mate-active state on this cluster. | Number | In a redundancy setup, the backup node is in the mate-active state most of the time. Backup specifies that the event broker is acting as the backup virtual router in an active/standby redundancy model. The detailed diagnosis of this measure reports the Node details. |
Nodes in mate-active state | Indicates the percentage of nodes in the mate-active state on the target cluster. | Percent | This measure reports 50% if one out of two nodes is in the mate-active state. |
Nodes with recent activity state change | Indicates the number of nodes on the target cluster that recently experienced an activity state change. | Number | This measure indicates any recent failover that led to an activity state change of the nodes. |
Designated nodes not in local-active state | Indicates the number of nodes that are currently not in their designated local-active state on the target cluster. | Number | When in operation, the messaging nodes assume either the Primary or the Backup role. At any one time, one node is in the local-active state and the other is in the mate-active state. Upon a failover, connections to the broker are switched over from the Primary to the Backup node automatically. This measure also indicates whether any failover happened on the cluster. |
Designated nodes not in mate-active state | Indicates the number of nodes that are currently not in their designated mate-active state on the target cluster. | Number | |
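The activity state measures in the table can be pictured as counts over the observed and designated state of each messaging node. The sketch below rests on stated assumptions: the `NodeState` class and `activity_summary` function are hypothetical, and it assumes that the primary node is designated local-active and the backup node is designated mate-active. How the eG agent actually determines the designated state is not described here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

LOCAL_ACTIVE = "local-active"
MATE_ACTIVE = "mate-active"


@dataclass
class NodeState:
    name: str
    designated: str  # state the node is expected to hold (assumed input)
    current: str     # state observed on the node right now


def activity_summary(nodes: Iterable[NodeState]) -> Dict[str, int]:
    """Count nodes per activity state and deviations from designation."""
    nodes = list(nodes)
    return {
        "local_active": sum(n.current == LOCAL_ACTIVE for n in nodes),
        "mate_active": sum(n.current == MATE_ACTIVE for n in nodes),
        "designated_not_local_active": sum(
            n.designated == LOCAL_ACTIVE and n.current != LOCAL_ACTIVE
            for n in nodes
        ),
        "designated_not_mate_active": sum(
            n.designated == MATE_ACTIVE and n.current != MATE_ACTIVE
            for n in nodes
        ),
    }


if __name__ == "__main__":
    # After a failover the two messaging nodes have swapped states.
    after_failover = [
        NodeState("primary", LOCAL_ACTIVE, MATE_ACTIVE),
        NodeState("backup", MATE_ACTIVE, LOCAL_ACTIVE),
    ]
    print(activity_summary(after_failover))
```

In this sketch a healthy pair yields zero for both deviation counts, while a failover raises both to one even though the local-active and mate-active counts stay the same. That is why the deviation measures can reveal a failover that the plain state counts would hide.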
The detailed diagnosis of the Nodes with network connectivity measure reveals further details such as the Node IP and Status.
Figure 1 : Detailed diagnosis of Nodes with network connectivity measure
The detailed diagnosis of the Nodes with port connectivity measure reveals further details such as the Node IP and Status.
Figure 2 : Detailed diagnosis of Nodes with port connectivity measure
The detailed diagnosis of the Nodes with SEMP connectivity measure reveals further details such as the Node IP and Status.
Figure 3 : Detailed diagnosis of Nodes with SEMP connectivity measure
The detailed diagnosis of the Nodes in local-active state measure reveals further details on the Node.
Figure 4 : Detailed diagnosis of Nodes in local-active state measure
The detailed diagnosis of the Nodes in mate-active state measure reveals further details on the Node.
Figure 5 : Detailed diagnosis of Nodes in mate-active state measure