High Availability Test

A high availability (HA) deployment of two Citrix NetScaler appliances can provide uninterrupted operation in any transaction. With one appliance configured as the primary node and the other as the secondary node, the primary node accepts connections and manages servers while the secondary node monitors the primary. If, for any reason, the primary node is unable to accept connections, the secondary node takes over.

The secondary node monitors the primary by sending periodic messages (often called heartbeat messages or health checks) to determine whether the primary node is accepting connections. If a health check fails, the secondary node retries the connection for a specified period, after which it determines that the primary node is not functioning normally. The secondary node then takes over for the primary (a process called failover).
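The retry-then-failover decision described above can be sketched as follows. This is a minimal illustration, not the appliance's actual logic: the `PeerMonitor` class, its method names, and the 3-second retry window are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Tracks heartbeats from the primary and decides when to fail over."""

    dead_interval_s: float          # hypothetical: how long the secondary keeps retrying
    last_heartbeat_s: float = 0.0   # time at which the last heartbeat was seen

    def record_heartbeat(self, now_s: float) -> None:
        # Called whenever a heartbeat from the primary arrives.
        self.last_heartbeat_s = now_s

    def should_fail_over(self, now_s: float) -> bool:
        # The secondary takes over only after the retry window has elapsed
        # without any heartbeat from the primary.
        return (now_s - self.last_heartbeat_s) >= self.dead_interval_s


monitor = PeerMonitor(dead_interval_s=3.0)
monitor.record_heartbeat(now_s=10.0)
print(monitor.should_fail_over(now_s=11.0))  # False: still inside the retry window
print(monitor.should_fail_over(now_s=13.5))  # True: window elapsed, fail over
```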

When monitoring a NetScaler appliance, you may first want to know whether the target NetScaler has been deployed in an HA setup. If so, you may then want to periodically evaluate the effectiveness of the HA configuration by checking the following:

  • Is the target NetScaler appliance the primary node or the secondary node of the HA setup?
  • What is the current state of the monitored NetScaler appliance? Did failover occur when the primary node went down?
  • Are both nodes able to communicate freely with each other via heartbeat packets?
  • Is the configuration of the primary and secondary nodes in sync?

The High Availability test does all the above, and more!

Target of the test : A NetScaler VPX/MPX

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the NetScaler appliance being monitored.

Configurable parameters for the test
Parameter Description

Test Period

How often should the test be executed.

Host

The IP address of the host for which the test is being configured.

NetScaler Username and NetScaler Password

To monitor a NetScaler device, the eG agent should be configured with the credentials of a user with read-only privileges to the target NetScaler device. Specify the credentials of such a user in the NetScaler Username and NetScaler Password text boxes.

Confirm Password

Confirm the NetScaler Password by retyping it here.

SSL

The eG agent collects performance metrics by invoking NITRO (NetScaler Interface Through Restful interfaces and Objects) APIs on the target NetScaler device. The NITRO APIs can be invoked over either HTTP or HTTPS. By default, the eG agent invokes them over HTTPS, which is why the SSL flag is set to Yes by default. If the target NetScaler device is not SSL-enabled, the NITRO APIs can be accessed over HTTP only. In this case, set the SSL flag to No.
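To illustrate how the SSL flag affects the request, the sketch below builds a NITRO URL for either mode. The helper name `nitro_url`, the example address, and the choice of the `hanode` statistics resource are assumptions made for illustration; this page does not state which NITRO resources the eG agent actually queries.

```python
def nitro_url(host: str, ssl: bool, resource: str = "hanode") -> str:
    """Build a NITRO statistics URL (illustrative helper, not part of eG)."""
    # SSL = Yes maps to HTTPS, SSL = No maps to plain HTTP.
    scheme = "https" if ssl else "http"
    # Assumed path layout: /nitro/v1/stat/<resource>
    return f"{scheme}://{host}/nitro/v1/stat/{resource}"


print(nitro_url("192.0.2.10", ssl=True))   # https://192.0.2.10/nitro/v1/stat/hanode
print(nitro_url("192.0.2.10", ssl=False))  # http://192.0.2.10/nitro/v1/stat/hanode
```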

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Is HA enabled?

Indicates whether/not the target NetScaler appliance is enabled for High Availability.

 

This measure reports the value Yes if 'High availability' is enabled for the monitored node, and returns the value No if 'High availability' is not configured for the node - i.e., if the node is not part of a 'High Availability' setup.

The values reported by this measure and their numeric equivalents are as shown in the table:

Numeric Value Measure Value
0 Yes
1 No

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the high availability state of the current node. However, in the graph of this measure, the HA states will be represented using the corresponding numeric equivalents only - i.e., 0 or 1.

HA node state

Indicates the current state of the highly available node.

 

The values that this measure can report and their numeric equivalents are as shown in the table below:

Numeric Value Measure Value
0 Up
1 Init
2 Disabled

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the current state of the highly available node. However, in the graph of this measure, the state will be represented using the corresponding numeric equivalents only - i.e., 0 to 2.

HA role of this node

Indicates the current role of this NetScaler device in a high availability setup.

 

The values that this measure can report and their corresponding numeric equivalents are shown in the table below:

Measure Value Numeric Value
Primary 0
Secondary 1
Stay Secondary 2

Note:

By default, this measure reports the above-mentioned Measure Values while indicating whether/not this NetScaler device is the master. However, in the graph of this measure, the states will be represented using the corresponding numeric equivalents only - i.e., 0 to 2.
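When reading graphs, the numeric equivalents have to be translated back into Measure Values. A minimal sketch of that lookup, using the HA node state and HA role tables above, is shown here; the dictionary and function names are hypothetical.

```python
# Numeric equivalents copied from the tables on this page.
HA_NODE_STATE = {0: "Up", 1: "Init", 2: "Disabled"}
HA_ROLE = {0: "Primary", 1: "Secondary", 2: "Stay Secondary"}


def label(table: dict, value: int) -> str:
    """Translate a graphed numeric value back into its Measure Value."""
    # Values outside the documented tables are flagged rather than guessed.
    return table.get(value, f"Unknown ({value})")


print(label(HA_ROLE, 2))        # Stay Secondary
print(label(HA_NODE_STATE, 1))  # Init
```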

Heartbeats received

Indicates the number of heartbeat packets received from the peer node - i.e., from the secondary NetScaler devices in a high availability setup - during the last measurement period.

Number

The heartbeat messages are UDP packets sent to port 3003 of the other node in an HA pair. These heartbeats are sent at regular intervals (default is 200 milliseconds) to determine the state of the peer node.
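As a rough sanity check, the heartbeat interval gives an expected count per measurement period. The sketch below assumes a 5-minute (300-second) measurement period, which is only an example value; actual counts may differ, for instance if the interval has been changed or if heartbeats travel over more than one interface.

```python
def expected_heartbeats(period_s: float, interval_ms: float = 200.0) -> int:
    """Approximate heartbeats exchanged in one measurement period."""
    # Convert the period to milliseconds and count whole intervals.
    return int(period_s * 1000 // interval_ms)


print(expected_heartbeats(300))  # 1500 at the default 200 ms interval
```

A received count that falls far below such an estimate may suggest that heartbeats from the peer are being lost.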

Heartbeats sent

Indicates the number of heartbeat packets sent to the peer node by the master NetScaler device during the last measurement period.

Number

Propagation timed out

Indicates the number of times the command propagation from the primary to the secondary nodes timed out during the last measurement period.

Number

Command propagation is a feature of the NetScaler appliance that ensures that the commands run on the primary NetScaler appliance of the high availability setup are automatically run on the secondary NetScaler appliance. When you run a command on the primary appliance, this feature ensures that the command runs on the secondary appliance before it runs on the primary appliance.

If command execution fails or times out on the secondary, the configurations of the primary and the secondary may go out of sync.

Synch failure

Indicates the number of times the configuration of the primary and secondary nodes failed to synchronize during the last measurement period.

Number

Synchronization is a process of duplicating the configuration of the primary node on the secondary node. The purpose of synchronization is to ensure that there is no loss of configuration information between the primary and the secondary nodes, regardless of the number of failovers that occur. 

Synchronization is triggered by either of the following circumstances:

  1. The secondary node in an HA setup comes up after a restart.
  2. The primary node becomes secondary after a failover.

A synchronization failure results in mismatched configurations. It can be caused by a mismatch in the Remote Procedure Call (RPC) password on the two nodes forming the high availability pair.

Has the master role been changed?

Indicates whether/not the state of the monitored appliance changed since the last measurement period.

 

The values that this measure can report and their corresponding numeric equivalents are shown in the table below:

Measure Value Numeric Value
No 0
Yes 1

If the state of the appliance changes from primary to secondary or secondary to primary, then the value of this measure will be Yes. If the state of the appliance remains unchanged since the previous measurement period, then the value of this measure will be No.

Note:

By default, this measure reports the above-mentioned Measure Values while indicating whether/not the state of the master has changed. However, in the graph of this measure, the states will be represented using the corresponding numeric equivalents only.
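The role-change rule described for this measure can be sketched by comparing the role seen in two consecutive measurement periods. The function name is hypothetical, and treating transitions that involve Stay Secondary as No is an assumption, since this page only describes primary/secondary swaps.

```python
def role_changed(previous_role: str, current_role: str) -> str:
    """Return "Yes" if the node swapped between Primary and Secondary."""
    # A swap means the two samples are exactly one Primary and one Secondary.
    flipped = {previous_role, current_role} == {"Primary", "Secondary"}
    # Assumption: any other combination (including Stay Secondary) reports No.
    return "Yes" if flipped else "No"


print(role_changed("Primary", "Secondary"))  # Yes
print(role_changed("Secondary", "Secondary"))  # No
```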