Db2 Replication Log Gap Test

Recovery Point Objective (RPO) is the maximum amount of data you can afford to lose if the DB2 UDB database server crashes. Recovery Time Objective (RTO) is a metric that defines how quickly you need to recover your application, database, and other services following a disaster (crash) in order to maintain business continuity.

In a high availability setup, the primary and standby databases should always be in sync. If the primary database crashes before its data is synced with the standby databases, a significant amount of data can be lost. To avoid such data loss, administrators should periodically track how far each standby database lags behind the primary, i.e., the amount of data the standby still needs to receive to be in sync with the primary. Similarly, if data and infrastructure are not recovered within the duration set by the Recovery Time Objective following a disaster, the business could suffer irreparable data loss and damage to data integrity. To avoid such eventualities and to ensure that business returns to normal quickly, administrators should periodically track the RPO and RTO of the target DB2 UDB database server.

For each database created on the target DB2 UDB database server, this test reports the amount of data that was lost when a switchover happened and the time lag in the transport of logs between the primary and standby databases. Using this test, administrators can estimate the time and the amount of data required for the primary and standby databases to be in sync. This helps administrators fine-tune their high availability environment.

Target of the test : A DB2 database server

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each database created on the target database server instance being monitored

Configurable parameters for the test
Parameter Description

Test Period

Specify how often the test should be executed.

Host

Specify the IP address of the DB2 server in this text box.

Port

Specify the port at which the target host is listening. The default port is 50000.

Username, Password and Confirm Password

To monitor a Db2 UDB database server, the eG agent should be configured with the credentials of a user who has any one of the following authorities: SYSADM, SYSCTRL, SYSMAINT, or SYSMON. Specify the credentials of such a user in the Username and Password text boxes. Confirm the password by retyping it in the Confirm Password text box.

Database

The test uses a database on the monitored Db2 UDB server. Specify the name of the database in the Database text box.

SSL

If the target database server is SSL-enabled, then set the SSL flag to Yes. If not, then set the SSL flag to No.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
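
As an illustration of how the Host, Port, Username, Password, Database and SSL parameters relate to one another, the sketch below assembles them into a Db2 CLI-style connection string of the kind accepted by client drivers such as ibm_db. This is only an assumption about how a client might connect; it does not describe how the eG agent connects internally. The function name and the sample values are hypothetical.

```python
def build_db2_conn_string(host, port=50000, database="", user="", password="", ssl=False):
    """Assemble a Db2 CLI-style connection string from the test parameters.

    Hypothetical helper for illustration only; the eG agent's own
    connection mechanism is not described in this document.
    """
    parts = [
        f"DATABASE={database}",
        f"HOSTNAME={host}",
        f"PORT={port}",          # 50000 is the default port named above
        "PROTOCOL=TCPIP",
        f"UID={user}",
        f"PWD={password}",
    ]
    if ssl:
        # Corresponds to setting the SSL flag to Yes
        parts.append("SECURITY=SSL")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    # Sample values only; replace with the details of your own server
    print(build_db2_conn_string("192.0.2.10", 50000, "SAMPLE", "monuser", "secret", ssl=True))
```

If the SSL flag is No, the SECURITY=SSL keyword is simply left out of the string.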
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Log sequence apply lagging RPO

Indicates the amount of data (in bytes) that was lost on this database when a database switchover happened.

Bytes

If too many log gaps are detected in the sequence of log files, it implies that the primary and standby databases are not up to date with each other. A consistent increase in the value of this measure affects the availability of data in the database.

Log transport lagging durations RTO

Indicates the time lag noticed in the transport of logs to this database with respect to the generation of logs in the primary database.

Seconds

Given enough resources, in particular network bandwidth, a DB2 UDB standby database can maintain pace with very high workloads. In cases where resources are constrained, the standby can begin to fall behind, resulting in a transport or apply lag.

A transport lag is the amount of data, measured in time, that the standby has not received from the primary.

A low value is desired for this measure.
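
To make the two measures above concrete, the following minimal sketch shows one way a byte gap and a transport lag could be derived from a pair of log positions and a pair of log timestamps, and then compared against RPO and RTO targets. It is not the test's actual implementation: the function names, input values, and thresholds are all hypothetical, and the inputs are assumed to have already been collected from the primary and standby databases.

```python
from datetime import datetime


def log_gap_bytes(primary_log_pos, standby_log_pos):
    """Bytes of log the standby has not yet reached (never negative)."""
    return max(0, primary_log_pos - standby_log_pos)


def transport_lag_seconds(primary_log_time, standby_log_time):
    """Seconds by which the standby's newest received log trails the primary."""
    return max(0.0, (primary_log_time - standby_log_time).total_seconds())


def check_targets(gap_bytes, lag_seconds, max_gap_bytes, max_lag_seconds):
    """Return the names of the targets that are currently breached."""
    breached = []
    if gap_bytes > max_gap_bytes:
        breached.append("RPO")
    if lag_seconds > max_lag_seconds:
        breached.append("RTO")
    return breached


if __name__ == "__main__":
    # Hypothetical sample readings, not real server output
    gap = log_gap_bytes(primary_log_pos=1_500_000, standby_log_pos=1_200_000)
    lag = transport_lag_seconds(
        datetime(2024, 1, 1, 12, 0, 30),  # last log generated on the primary
        datetime(2024, 1, 1, 12, 0, 0),   # last log received by the standby
    )
    # Thresholds are illustrative; set them from your own RPO and RTO targets
    print(gap, lag, check_targets(gap, lag, max_gap_bytes=1_000_000, max_lag_seconds=10))
```

Clamping both values at zero keeps a small clock skew or a momentarily stale reading on the primary from producing a negative lag.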