PostgreSQL Replication RPO Test

Recovery Point Objective (RPO) is the maximum tolerable amount of data you can afford to lose in case of a potential PostgreSQL database server crash. In a high availability setup, the master and the replica servers should always be in sync. If the master database server crashes before data is synced with the replica servers, then, a significant amount of data will be lost. Generally, administrators do not wish to lose data in case of failures/crashes. To avoid such data loss, it is essential for the administrators to periodically keep track on the amount of data that each replica server is lagging behind i.e., the amount of data that is still more required for the replica server and the master to be in sync. The PostgreSQL Replication RPO test helps administrators in this regard!

In a high availability setup, this test auto-discovers the replica servers connecting to the target PostgreSQL database server (which is the master) and for each replica server, reports the amount of data that is yet to be synced from the master. Using this test, administrators can figure out the replica server that is more vulnerable to data loss in case of potential server crash and fine-tune backup schedules accordingly.

Note:

This test will report metrics only if the monitored target PostgreSQL database server is the master in a high availability setup.

Target of the test : PostgreSQL server

Agent deploying the test: An internal/remote agent

Outputs of the test : One set of results for each replica server connecting to the target PostgreSQL server

Configurable parameters for the test
Parameter Description

Test period

How often should the test be executed

Host

The IP address of the host for which this test is to be configured.

Port

The port on which the server is listening. The default port is 5432.

Username

In order to monitor a PostgreSQL server, you need to manually create a special database user account in every PostgreSQL database instance that requires monitoring. When doing so, ensure that this user is vested with the superuser privileges. The sample script we recommend for user creation for eG monitoring is:

CREATE ROLE eguser LOGIN

ENCRYPTED PASSWORD {‘eguser password’}

SUPERUSER NOINHERIT NOCREATEDB NOCREATEROLE;

Specify the name of this user in the Username text box.

Password

The password associated with the above Username (can be ‘NULL’). Here, ‘NULL’ means that the user does not have any password.

Confirm Password

Confirm the Password (if any) by retyping it here.

DB Name

The name of the database to connect to. The default is “postgres”.

SSL

If the PostgreSQL server being monitored is an SSL-enabled server, then set the SSL flag to Yes. If not, then set the SSL flag to No.

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Replication lag size

Indicates the amount of data that is yet to be synced to this replica server.

MB

Compare the value of this measure across clients to figure out the replica server that is more vulnerable to data loss.

The detailed diagnosis of this measure lists the Process ID, OID of the user, User name, Application name, Sent location and Replay location.