Solace Redundancy Status Test

Solace PubSub+ appliances can operate in high-availability (HA) redundant pairs for fault tolerance. Redundancy provides 1:1 appliance sparing to increase overall service availability. HA redundancy eliminates the potential for a single point of failure by allowing a network administrator to define two appliances as a redundant pair. If one of the appliances is taken out of service or fails, the other appliance automatically takes over responsibility for the clients typically served by the out-of-service appliance.

The redundancy feature is largely transparent to clients and other appliances in the network. Only the two appliances that are paired as mates require explicit configuration to take advantage of the feature. To support redundancy, each appliance uses a primary and a backup virtual router. To enable the backup virtual router to assume the role of its mate’s primary virtual router when a failure occurs, the configuration of the virtual routers on each appliance must mirror one another. That is, the backup virtual routers must have the same configuration as the primary virtual routers they back up.

For an active/standby redundant pair, the primary virtual router is on the primary appliance, and the backup virtual router is on the standby appliance. If the primary appliance goes out of service, the backup virtual router of the standby appliance changes to an active state, and it provides service for clients and handles the data and messages that typically use the primary virtual router of the primary appliance that has gone out of service.

For an active/active redundant pair, the primary virtual routers on both appliances are active, but the backup virtual routers are idle. If one of the appliances in the redundant pair goes out of service, the backup virtual router of the other appliance changes to an active state, and it provides service for clients and handles the data and messages that typically use the primary virtual router of the appliance that is out of service.
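The two failover behaviours described above can be sketched as a small model. This is only an illustration of the prose, not Solace or eG code; the function name and the appliance labels ("primary", "standby", "a", "b") are hypothetical.

```python
from typing import Dict, Set


def active_virtual_routers(mode: str, failed_appliance: str) -> Dict[str, Set[str]]:
    """Return the virtual routers active on the surviving appliance.

    Hypothetical model of the text above: appliance labels are
    illustrative only.
    """
    if mode == "active/standby":
        # Only the primary appliance runs an active (primary) virtual router.
        before = {"primary": {"primary"}, "standby": set()}
    elif mode == "active/active":
        # Both appliances run an active primary virtual router.
        before = {"a": {"primary"}, "b": {"primary"}}
    else:
        raise ValueError(f"unknown redundancy mode: {mode!r}")
    if failed_appliance not in before:
        raise ValueError(f"unknown appliance: {failed_appliance!r}")

    survivor = next(name for name in before if name != failed_appliance)
    after = set(before[survivor])
    # The survivor's backup virtual router becomes active only if the
    # failed appliance was actually serving clients.
    if before[failed_appliance]:
        after.add("backup")
    return {survivor: after}
```

For example, `active_virtual_routers("active/active", "a")` reports that appliance `b` ends up with both its primary and backup virtual routers active.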

If the target Solace PubSub+ Event Broker in a redundant HA pair is down, or if the redundancy configuration of the target broker is in the shutdown state, the target broker will be unable to handle data and messages. Messages may then go undelivered to clients, which affects the business delivery cycle. Administrators also have to quickly identify whether the role of the target broker has changed from primary virtual router to backup virtual router or vice versa, and pinpoint where exactly the primary or backup virtual router has faulted: is it the message spool, the ADB, the flash memory module, the power module, or the routing interface? The Solace Redundancy Status test helps administrators quickly identify the pain points encountered by the target broker.

Using this test, administrators can determine the redundancy state, configuration state, and role of the target Solace PubSub+ Event Broker. The test also throws light on when exactly the role of the target broker changed from primary virtual router to backup virtual router or vice versa. In addition, it helps administrators identify the exact module on the primary or backup virtual router that has faulted and is not ready: is it the message spool, the ADB, the flash memory module, the power module, or the routing interface?
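As a rough illustration of how the readiness measures can be used to locate a faulted module, the sketch below scans a set of measure values for any that report 0 (Not ready or Down). The measure names are taken from this page; the function name, the tuple of names, and the input dictionary are hypothetical, and how the values are obtained from eG Enterprise is outside the scope of the sketch.

```python
from typing import Iterable, List, Mapping

# Ready/Not ready (and Up/Down) measures of the primary virtual router,
# as named on this page. 1 means Ready/Up, 0 means Not ready/Down.
PRIMARY_READINESS_MEASURES = (
    "Routing interface",
    "SMRP status",
    "ADM card status",
    "ADM datapath status",
    "Flash module status",
    "Power module status",
    "ADM content status",
    "Disk status",
    "Disk content status",
    "DB sync status",
    "DB build status",
)


def faulted_modules(
    measures: Mapping[str, int],
    names: Iterable[str] = PRIMARY_READINESS_MEASURES,
) -> List[str]:
    """Return the measures that currently report 0.

    Measures missing from the input are ignored rather than treated
    as faults.
    """
    return [name for name in names if measures.get(name) == 0]
```

The same check can be applied to the backup virtual router by passing the corresponding "Backup ..." measure names as `names`.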

Target of the test : A Solace PubSub+ Event Broker

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the target Solace PubSub+ Event Broker that is to be monitored.

Configurable parameters for the test
Parameter | Description

Test Period

How often the test should be executed.

Host

The IP address of the target host for which this test is to be configured.

Port

Refers to the port on which the Solace PubSub+ Event Broker listens.

UserName and Password

By default, the eG agent executes SEMP (Solace Element Management Protocol) APIs on the target broker to collect the required metrics. For the eG agent to execute the SEMP APIs, a special user with read-only privileges is required. Specify the credentials of such a user in the UserName and Password text boxes. To know how to create such a user, refer to Creating a New User for Monitoring Solace PubSub+ Event Broker.

Confirm Password

Confirm the Password by retyping it in the Confirm Password text box.

SSL

By default, this flag is set to No, indicating that the Solace PubSub+ Event Broker is not SSL-enabled. Set this flag to Yes if the Solace PubSub+ Event Broker is SSL-enabled.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
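
To show how the Host, Port, UserName, Password, and SSL parameters fit together, here is a minimal sketch that builds (but does not send) a SEMP request for redundancy details. The exact requests the eG agent issues are not documented on this page; the `/SEMP` path and the `show redundancy detail` RPC body are assumptions based on the SEMP v1 convention of mirroring CLI commands in XML, and the function name and sample values are hypothetical.

```python
import base64
import urllib.request


def build_semp_request(host: str, port: int, username: str, password: str,
                       ssl: bool = False) -> urllib.request.Request:
    """Build a SEMP v1 style POST request without sending it."""
    # The SSL flag decides the URL scheme.
    scheme = "https" if ssl else "http"
    url = f"{scheme}://{host}:{port}/SEMP"  # assumed SEMP v1 endpoint path

    # Assumed XML form of the CLI command "show redundancy detail".
    body = b"<rpc><show><redundancy><detail/></redundancy></show></rpc>"

    # HTTP Basic authentication with the read-only monitoring user.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")

    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/xml")
    return request
```

A request built this way could be passed to `urllib.request.urlopen` against a test broker; the response would be an XML document that still needs to be parsed into the measures listed below.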
Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Configuration status

Indicates the current state of the redundancy configuration.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Shutdown 0
Released 1
Enabled 2
Enabled-Released 3

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current state of the redundancy configuration. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 to 3.

The detailed diagnosis of this measure lists the name of the mate router, the operation mode, switchover mechanism of the broker, the redundancy mode and the failover criteria of the target broker.
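
When graph data or exported values for this measure need to be turned back into readable states, the numeric equivalents in the table above can be captured in an enumeration. This is a sketch only; the class and function names are hypothetical and not part of eG Enterprise. The same pattern applies to the other measures on this page that report a value table.

```python
from enum import IntEnum


class ConfigurationStatus(IntEnum):
    """Numeric equivalents of the Configuration status measure."""
    SHUTDOWN = 0
    RELEASED = 1
    ENABLED = 2
    ENABLED_RELEASED = 3


# Display labels exactly as they appear in the table above.
_LABELS = {
    ConfigurationStatus.SHUTDOWN: "Shutdown",
    ConfigurationStatus.RELEASED: "Released",
    ConfigurationStatus.ENABLED: "Enabled",
    ConfigurationStatus.ENABLED_RELEASED: "Enabled-Released",
}


def configuration_status_label(value: int) -> str:
    """Translate a graphed numeric value into its measure value.

    Raises ValueError for numbers outside the documented 0 to 3 range.
    """
    return _LABELS[ConfigurationStatus(value)]
```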

Redundancy status

Indicates the current redundancy status of the target event broker.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Down 0
Up 1

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current redundancy status of the target event broker. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

The detailed diagnosis of this measure lists the time at which redundancy failed at the last instance and the reason for the failure.

Is auto revert enabled?

Indicates whether/not the auto-revert option is enabled.

 

The auto-revert option controls what happens when the primary appliance comes back online after a failover has occurred. When auto-revert is not enabled (which is the default and recommended state), the primary appliance stays as a standby after it comes back online, allowing the backup appliance to remain active. In this case, the primary appliance becomes active only if the backup appliance fails or gives up activity.

If auto-revert is enabled, as soon as the primary appliance comes back online, it becomes active and switches the backup appliance from active to standby.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
No 0
Yes 1

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether/not the auto-revert option is enabled. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.
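
The effect of the auto-revert option described above can be illustrated by replaying a sequence of up/down events for the two appliances. This is a simplified model of the prose, not broker code; the function name and the event format are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def active_appliance(events: Iterable[Tuple[str, str]],
                     auto_revert: bool = False) -> Optional[str]:
    """Replay ("primary" | "backup", "up" | "down") events.

    Returns the appliance that is active at the end, or None if both
    appliances are offline. Both start online with the primary active.
    """
    online = {"primary": True, "backup": True}
    active: Optional[str] = "primary"
    for appliance, change in events:
        if appliance not in online or change not in ("up", "down"):
            raise ValueError(f"bad event: {(appliance, change)!r}")
        online[appliance] = change == "up"
        mate = "backup" if appliance == "primary" else "primary"
        if change == "down":
            if active == appliance:
                # Failover: the mate takes over if it is available.
                active = mate if online[mate] else None
        elif active is None:
            # Nothing was active, so the returning appliance takes over.
            active = appliance
        elif appliance == "primary" and auto_revert:
            # With auto-revert, the returning primary reclaims activity.
            active = "primary"
    return active
```

With the default `auto_revert=False`, a primary that fails and returns stays on standby, and becomes active again only when the backup later goes down.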

Active standby role

Indicates the role of the target event broker in an active/standby redundant pair.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Primary 1
Backup 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the Active-Standby role of the target event broker. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Has virtual router activity state changed?

Indicates whether/not the state of the target event broker has changed from primary virtual router to backup virtual router and vice versa.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Yes 1
No 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether/not the state of the target event broker has changed. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

The detailed diagnosis of this measure lists the name of the virtual router, the previous state of the virtual router and the current state of the virtual router.

ADB link state

Indicates whether/not the ADB link from the target event broker to its mate is connected.

 

An Assured Delivery Blade (ADB) is a card in a Solace appliance that enables guaranteed delivery of messages. ADBs have non-volatile memory where critical data structures are stored and mirrored to an HA mate.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Yes 1
No 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether/not the ADB link is connected from the target event broker. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

The detailed diagnosis of this measure lists the time at which the ADB link failed at the last instance and the reason for the failure.

ADB hello state

Indicates whether/not the ADB hello message was received by the target event broker.

 

ADB hello refers to a basic interaction between an application and the Solace message broker to send a simple hello message.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Yes 1
No 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether/not the ADB hello message was received by the target event broker. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

The detailed diagnosis of this measure lists the time at which the ADB hello message failed at the last instance and the reason for the failure.

ADB hello avg latency

Indicates the average time taken by the target event broker to receive the ADB hello message.

Milliseconds

 

ADB hello max latency

Indicates the maximum time taken by the target event broker to receive the ADB hello message.

Milliseconds

 

Primary activity

Indicates the current status of the target event broker if the broker is the primary virtual router in a redundant setup.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Shutdown 0
Subscriptions Pending 1
Local Inactive 2
Local Active 3
Master Active 4
Mate Active 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the target event broker as the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 to 5.

The detailed diagnosis of this measure lists the name of the VRRP interface, the VRRP address, the VRRP interface role and the routing interface.

Routing interface

Indicates the state of the routing interface connecting the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Up 1
Down 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the state of the routing interface connecting the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Messagespool status

Indicates the current status of the message spool in the primary virtual router that is to provide Guaranteed Messaging.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
AD-Disable 0
AD-Not Ready 1
AD-Standby 2
AD-Active 3
AD-Activating 4
Unknown 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the message spool in the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 to 5.

SMRP status

Indicates the current status of the Subscription Management Routing Protocol (SMRP) on the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the Subscription Management Routing Protocol (SMRP) on the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

ADM card status

Indicates the current status of the ADB on the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the ADB on the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

ADM datapath status

Indicates the current status of the ADB datapath of the primary virtual router.

 

This measure is a good indicator to figure out whether the ADB datapath of the primary virtual router is able to spool messages.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the ADB datapath of the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Flash module status

Indicates the current status of the Flash Memory Module on the ADB linked to the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the Flash Memory Module on the ADB linked to the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Power module status

Indicates the current status of the power module on the ADB linked to the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the power module on the ADB linked to the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

ADM content status

Indicates the current status of the contents of the ADB linked to the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the contents of the ADB linked to the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Disk status

Indicates the current status of the external disk array of the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the external disk array of the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Disk content status

Indicates the current status of the spool file directory on the external disk storage array of the primary virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the spool file directory on the external disk storage array of the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

DB sync status

Indicates the synchronization status of the database of the primary virtual router.

 

When an event broker is restarted while running Multi-Node Routing, it must synchronize its database with its neighbor event brokers to learn of the subscriptions it will become active for. This value indicates the SMRP synchronization status.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the synchronization status of the database of the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

DB build status

Indicates the current status of the database on the primary virtual router.

 

Whenever redundancy is enabled on an event broker, it can take up to a minute to ready the database for taking activity from its mate event broker on demand.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the database on the primary virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

DB build

Indicates the percentage of time taken by the primary virtual router to ready the database for taking activity from the mate event broker (backup virtual router) on demand.

Percent

A value close to 100 percent indicates that the database is not ready and is taking too long to take activity.

Backup activity

Indicates the current status of the target event broker if the broker is the backup virtual router in a redundant setup.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Shutdown 0
Subscriptions Pending 1
Local Inactive 2
Local Active 3
Master Active 4
Mate Active 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the target event broker as the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 to 5.

The detailed diagnosis of this measure lists the name of the VRRP interface, the VRRP address, the VRRP interface role and the routing interface.

Backup routing interface

Indicates the state of the routing interface connecting the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Up 1
Down 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the state of the routing interface connecting the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup messagespool status

Indicates the current status of the message spool in the backup virtual router that is to provide Guaranteed Messaging.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
AD-Disable 0
AD-Not Ready 1
AD-Standby 2
AD-Active 3
AD-Activating 4
Unknown 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the message spool in the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 to 5.

Backup SMRP status

Indicates the current status of the Subscription Management Routing Protocol (SMRP) on the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the Subscription Management Routing Protocol (SMRP) on the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup ADM card status

Indicates the current status of the ADB on the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the ADB on the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup ADM datapath status

Indicates the current status of the ADB datapath of the backup virtual router.

 

This measure is a good indicator to figure out whether the ADB datapath of the backup virtual router is able to spool messages.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the ADB datapath of the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup flash module status

Indicates the current status of the Flash Memory Module on the ADB linked to the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the Flash Memory Module on the ADB linked to the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup power module status

Indicates the current status of the power module on the ADB linked to the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the power module on the ADB linked to the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup ADM content status

Indicates the current status of the contents of the ADB linked to the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the contents of the ADB linked to the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup disk status

Indicates the current status of the external disk array of the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the external disk array of the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup disk content status

Indicates the current status of the spool file directory on the external disk storage array of the backup virtual router.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the spool file directory on the external disk storage array of the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup DB sync status

Indicates the synchronization status of the database of the backup virtual router.

 

When an event broker is restarted while running Multi-Node Routing, it must synchronize its database with its neighbor event brokers to learn of the subscriptions it will become active for. This value indicates the SMRP synchronization status.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the synchronization status of the database of the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup DB build status

Indicates the current status of the database on the backup virtual router.

 

Whenever redundancy is enabled on an event broker, it can take up to a minute to ready the database for taking activity from its mate event broker on demand.

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure values Numeric values
Ready 1
Not ready 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the database on the backup virtual router. The graph of this measure however, is represented using the numeric equivalents only i.e., 0 or 1.

Backup DB build

Indicates the percentage of time taken by the backup virtual router to ready the database for taking activity from the mate event broker (primary virtual router) on demand.

Percent

A value close to 100 percent indicates that the database is not ready and is taking too long to take activity.