Db2 Replication Log Gap Test

Recovery Point Objective (RPO) is the maximum amount of data you can afford to lose if the DB2 UDB database server crashes. Recovery Time Objective (RTO) is a metric that indicates how quickly you need to recover your application, database, and other services following a disaster (crash) in order to maintain business continuity.

In a high availability setup, the primary and standby databases should always be in sync. If the primary database crashes before its data is synced with the standby databases, a significant amount of data will be lost. To avoid such data loss, administrators should periodically track how far each standby database lags behind the primary, i.e., the amount of data the standby still needs to receive to be in sync with the primary. Similarly, if data and infrastructure are not recovered within the duration set as the Recovery Time Objective, businesses could suffer irreparable loss of data and data integrity. To avoid such eventualities and to restore normal operations quickly, administrators should periodically track the RPO and RTO of the target DB2 UDB database server.

For each database created on the target DB2 UDB database server, this test reports the amount of data that was lost when a switch over happened and the time lag noticed in the transport of logs between the primary and standby databases. Using this test, administrators can accurately estimate the time and amount of data required for the primary and standby databases to be in sync. This will help administrators fine-tune their high availability environment.

Target of the test : A DB2 database server

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each database created on the target database server instance being monitored

Configurable parameters for the test
Parameter | Description

Test period

How often the test should be executed.

Host

The IP address of the DB2 server

Port

The port number through which the DB2 server communicates. The default port is 50000.

    User

    Specify the name of a user who is authorized to access the target database server and collect the required metrics. You can create a separate user on the OS hosting the DB2 server for this purpose; the steps are detailed in the Creating a Special User for Monitoring DB2 Server topic.

    Password

    Enter the password of the specified User in the Password text box.

    Confirm Password

    Confirm the Password by retyping it in the Confirm Password text box.

    Database

    Specify the name of the database on the monitored DB2 server to be used by this test.

    Include DB

    Specify a comma-separated list of databases that you wish to monitor in the Include DB text box.

    Exclude DB

    Specify a comma-separated list of databases that need to be excluded from monitoring in the Exclude DB text box.
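The way the Include DB and Exclude DB lists narrow the set of monitored databases can be sketched as follows. This is only an illustration: the function name `select_databases` is hypothetical, and the rules it applies (an empty or `none` value means "no filter", names are compared case-insensitively, and exclusion wins over inclusion) are assumptions rather than documented agent behavior.

```python
def select_databases(all_dbs, include_db="none", exclude_db="none"):
    """Return the databases to monitor, honouring include/exclude lists.

    Assumptions (not stated in the test description): 'none' or an empty
    string means no filter, matching is case-insensitive, and a database
    named in both lists is excluded.
    """
    def parse(spec):
        spec = (spec or "").strip()
        if not spec or spec.lower() == "none":
            return set()
        # Split the comma-separated list and ignore stray whitespace.
        return {name.strip().upper() for name in spec.split(",") if name.strip()}

    include, exclude = parse(include_db), parse(exclude_db)
    selected = []
    for db in all_dbs:
        key = db.upper()
        if include and key not in include:
            continue  # an include list is set and this database is not on it
        if key in exclude:
            continue  # explicitly excluded
        selected.append(db)
    return selected
```

For instance, with databases `SAMPLE`, `SALES`, and `HR`, setting Exclude DB to `SALES` would leave `SAMPLE` and `HR` under this sketch.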

    SSL

    If the target database server is SSL-enabled, then set the SSL flag to Yes. If not, then set the SSL flag to No.

    Trust Store File Name

    This parameter is applicable only if the target DB2 UDB database is SSL-enabled; otherwise, set it to none. Specify the file name of the client-side SSL truststore that contains the server certificate required for establishing an SSL connection. The truststore is used to verify the identity of the server and enable a secure communication channel.

    By default, the truststore file should be placed in: <EG_INSTALL_DIR>/jre/lib/security/mytruststore.jks

    Here, mytruststore.jks is the Truststore file name. You may change this to any valid file name. By default, none is specified against this text box.

    Trust Store Password

    This parameter is applicable only if the target DB2 UDB database is SSL-enabled; otherwise, set it to none. If a Trust Store File Name is provided, specify the password used to obtain the associated certificate details from the truststore file. By default, this parameter is set to none.

    Confirm Password

    Confirm the Trust Store Password by retyping it in the Confirm Password text box.
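To show how the Host, Port, Database, SSL, and truststore settings above could fit together, here is a rough sketch that assembles a connection URL in the style of the IBM Data Server Driver for JDBC, using its `sslConnection`, `sslTrustStoreLocation`, and `sslTrustStorePassword` properties. Whether the eG agent builds its connection this way is an assumption; the helper name `build_db2_jdbc_url` and the sample paths are hypothetical.

```python
def build_db2_jdbc_url(host, port, database, ssl=False,
                       trust_store_path=None, trust_store_password=None):
    """Assemble a DB2 JDBC-style URL from the test parameters.

    Assumption: the connection is made through the IBM JDBC driver, whose
    URL takes the form jdbc:db2://host:port/database:prop=value;...
    A value of 'none' is treated as "not configured".
    """
    url = f"jdbc:db2://{host}:{port}/{database}"
    if not ssl:
        return url  # plain connection, no SSL properties needed

    props = ["sslConnection=true"]
    if trust_store_path and trust_store_path.lower() != "none":
        props.append(f"sslTrustStoreLocation={trust_store_path}")
    if trust_store_password and trust_store_password.lower() != "none":
        props.append(f"sslTrustStorePassword={trust_store_password}")
    # Driver properties follow the database name after a colon and are
    # each terminated by a semicolon.
    return url + ":" + ";".join(props) + ";"
```

In practice a password would normally be passed through a properties object rather than embedded in the URL; it appears inline here only to keep the sketch short.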

    DD Frequency

    Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.

    Detailed Diagnosis

    To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

    The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

    • The eG manager license should allow the detailed diagnosis capability
    • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
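The DD Frequency value described above pairs a normal frequency with an abnormal one (for example, 1:1). A minimal sketch of how such a value might be interpreted is given below. The function names are hypothetical, and the reading of `N:M` as "every Nth normal run, every Mth problem detection", with 0 meaning "never", is an assumption drawn from the description of the default rather than from documented agent logic.

```python
def parse_dd_frequency(value):
    """Parse 'normal:abnormal' into a tuple, or None when DD is disabled."""
    text = value.strip().lower()
    if text == "none":
        return None  # detailed diagnosis switched off for this test
    normal, abnormal = (int(part) for part in text.split(":"))
    return normal, abnormal


def should_generate_dd(dd_frequency, run_count, problem_count=0, problem_now=False):
    """Decide whether detailed measures are due on this run.

    run_count: 1-based count of test executions (assumed bookkeeping).
    problem_count: 1-based count of problem detections (assumed bookkeeping).
    problem_now: True if the current run detected a problem.
    """
    freq = parse_dd_frequency(dd_frequency)
    if freq is None:
        return False
    normal, abnormal = freq
    if problem_now:
        # Abnormal frequency applies when a problem has been detected.
        return abnormal > 0 and problem_count % abnormal == 0
    # Normal frequency applies to ordinary runs.
    return normal > 0 and run_count % normal == 0
```

Under this reading, the default of 1:1 yields detailed measures on every run and on every detected problem, which matches the behavior described for the default.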

Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

    Log sequence apply lagging RPO

    Indicates the amount of data (in bytes) that was lost on this database when a database switchover happened.

    Bytes

    If many gaps are detected in the sequence of log files, the primary and standby databases are not up to date with each other. A consistent increase in the value of this measure affects the availability of data in the database.

    Log transport lagging durations RTO

    Indicates the time lag noticed in the transport of logs to this database with respect to the generation of logs in the primary database.

    Seconds

    Given enough resources, in particular network bandwidth, a DB2 UDB standby database can maintain pace with very high workloads. In cases where resources are constrained, the standby can begin to fall behind, resulting in a transport or apply lag.

    A transport lag is the amount of data, measured in time, that the standby has not received from the primary.

    A low value is desired for this measure.
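The two measures above can be illustrated with a small calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the inputs are taken to be log positions (in bytes) and log timestamps of the kind Db2 HADR exposes through the MON_GET_HADR table function (for example PRIMARY_LOG_POS, STANDBY_REPLAY_LOG_POS, PRIMARY_LOG_TIME, and STANDBY_LOG_TIME). The test description does not state which source the agent actually reads.

```python
from datetime import datetime


def log_apply_gap_bytes(primary_log_pos, standby_replay_log_pos):
    """Bytes of log written on the primary but not yet replayed on the standby.

    Both arguments are byte offsets in the log stream (assumed inputs).
    The result is clamped at zero in case the two values were sampled at
    slightly different moments.
    """
    return max(0, primary_log_pos - standby_replay_log_pos)


def log_transport_lag_seconds(primary_log_time, standby_log_time):
    """Transport lag: how far, in time, received log trails generated log.

    primary_log_time: timestamp of the latest log written on the primary.
    standby_log_time: timestamp of the latest log received by the standby.
    """
    return max(0.0, (primary_log_time - standby_log_time).total_seconds())
```

As a usage example, a primary position of 1,000,000 bytes against a standby replay position of 750,000 bytes gives a gap of 250,000 bytes, and log timestamps 30 seconds apart give a transport lag of 30 seconds.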