Db2 Replication Heartbeats Test
A 'heartbeat' is a signal sent between a primary database and a standby database. This signal is taken as a sign of vitality. If there is no response to the signal, it is understood that the primary database has health or technical problems.
If the standby database does not receive any heartbeats from the primary database for a certain timeout period, a 'Heartbeat Lost' condition occurs and the corresponding standby database is deemed to be dead/unavailable.
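The timeout rule described above can be illustrated with a minimal Python sketch. This is not how Db2 or the test implements heartbeat tracking; the `HeartbeatMonitor` class, its field names, and the 120-second timeout are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper that tracks the last heartbeat seen from the primary."""

    timeout_seconds: float    # assumed timeout period, in seconds
    last_heartbeat_at: float  # time of the most recent heartbeat, in seconds

    def record_heartbeat(self, now: float) -> None:
        # Remember when the latest heartbeat arrived.
        self.last_heartbeat_at = now

    def is_heartbeat_lost(self, now: float) -> bool:
        # 'Heartbeat Lost' once the silence reaches the timeout period.
        return (now - self.last_heartbeat_at) >= self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=120.0, last_heartbeat_at=0.0)
monitor.record_heartbeat(now=30.0)
still_alive = not monitor.is_heartbeat_lost(now=100.0)  # 70 s of silence
lost = monitor.is_heartbeat_lost(now=160.0)             # 130 s of silence
```

In this sketch, 70 seconds of silence stays within the assumed timeout, while 130 seconds of silence triggers the 'Heartbeat Lost' condition.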
To avoid the loss of heartbeats and the consequent failure of a standby database, administrators must keep a close watch on the heartbeats sent by the primary database to each standby database, detect issues in the transmission of heartbeats, and clear the bottlenecks well before the configured timeout period expires and the standby database is declared dead. This can be achieved using the Db2 Replication Heartbeats test.
This test monitors the heartbeats that each DB2 UDB primary database sends to the standby database. In the process, the test reports the count of heartbeats that were missed during a measurement period, the count of heartbeats expected during that period, and the percentage of heartbeats missed between the primary and standby databases. Alerts are promptly sent out if too many heartbeats are missed. This way, administrators can proactively detect problems in heartbeat communication and resolve them before the standby databases die.
Target of the test : A DB2 database server
Agent deploying the test : An internal/remote agent
Outputs of the test : One set of results for each database instance created on the target database server being monitored
| Parameter | Description |
|---|---|
| Test period | How often the test should be executed. |
| Host | The IP address of the DB2 server. |
| Port | |
| User | Specify the name of the user who is authorized to access the target database server and collect the required metrics. You can create a separate user on the OS hosting the DB2 server for this purpose. The steps for doing so are detailed in Creating a Special User for Monitoring DB2 Server. |
| Password | Enter the password of the specified User in the Password text box. |
| Confirm Password | Confirm the Password by retyping it in the Confirm Password text box. |
| Database | Specify the name of the database on the monitored DB2 server to be used by this test. |
| Include DB | Specify a comma-separated list of databases that you wish to monitor in the Include DB text box. |
| Exclude DB | Specify a comma-separated list of databases that need to be excluded from monitoring in the Exclude DB text box. |
| SSL | If the target database server is SSL-enabled, set the SSL flag to Yes. If not, set the SSL flag to No. |
| Trust Store File Name | This parameter is applicable only if the target DB2 UDB database is SSL-enabled. If it is not, set this parameter to none. Specify the file name of the client-side SSL truststore that contains the server certificate required for establishing an SSL connection. The truststore is used to verify the identity of the server and enable a secure communication channel. By default, the truststore file should be placed in: <EG_INSTALL_DIR>/jre/lib/security/mytruststore.jks. Here, mytruststore.jks is the truststore file name. You may change this to any valid file name. By default, none is specified against this text box. |
| Trust Store Password | This parameter is applicable only if the target DB2 UDB database is SSL-enabled. If it is not, set this parameter to none. If a Trust Store File Name is provided, then, in this text box, provide the password that is used to obtain the associated certificate details from the truststore file. By default, this parameter is set to none. |
| Confirm Password | Confirm the Trust Store Password by retyping it in the Confirm Password text box. |
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Heartbeats missed | Indicates the count of heartbeats missed by this database during the last measurement period. | Number | If the value of this measure is zero, it indicates that no heartbeats have been missed and the connection is healthy. The higher the value, the worse the condition of the connection. |
| Heartbeats expected | Indicates the number of heartbeats expected by this database during the last measurement period. | Number | |
| Miss ratio | Indicates the percentage of heartbeats missed by this database. | Percentage | A low value is desired for this measure. A sudden/gradual increase in the value of this measure indicates that the standby database is losing its connection with the primary database. |
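The relationship between the three measures can be sketched in Python as follows. The formula (missed heartbeats divided by expected heartbeats, times 100) is an assumption inferred from the measure descriptions, not a documented calculation. The `miss_ratio` function name and the handling of a zero expected count are likewise hypothetical.

```python
def miss_ratio(missed: int, expected: int) -> float:
    """Percentage of expected heartbeats that were missed (assumed formula)."""
    if missed < 0 or expected < 0:
        raise ValueError("heartbeat counts must be non-negative")
    if expected == 0:
        # Assumption: if no heartbeats were expected, none can be missed.
        return 0.0
    return 100.0 * missed / expected


healthy = miss_ratio(missed=0, expected=120)   # no heartbeats missed
degraded = miss_ratio(missed=3, expected=120)  # 3 of 120 heartbeats missed
```

Under this assumed formula, missing 3 of 120 expected heartbeats yields a miss ratio of 2.5 percent, while a healthy connection yields 0.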