Azure PostgreSQL Server Test

Single Server is a fully managed database service designed for workloads with minimal requirements for database customization. The Single Server deployment model is optimized for built-in high availability and elasticity at reduced cost. The architecture separates compute and storage: the database engine runs on a proprietary compute container, while data files reside on Azure storage. The storage maintains three locally redundant, synchronous copies of the database files, ensuring data durability. During planned or unplanned failover events, if the server goes down, the service maintains high availability of the servers using a dedicated automated procedure. The service automatically patches the underlying hardware, OS, and database engine; the patching includes security and software updates. The Single Server service also automatically creates server backups and stores them in user-configured locally redundant storage (LRS) or geo-redundant storage. Backups can be used to restore your server to any point in time within the backup retention period.

However, to maintain the performance, reliability, and availability of the server and its applications, it is vital to keep track of the performance of the database server on a day-to-day basis. This helps you proactively identify issues in resource usage and database operations, which in turn helps you troubleshoot and optimize your workload. The Azure PostgreSQL Server Test helps administrators in this regard.

This test auto-discovers the Single Server instances and reports the CPU usage, memory utilization, network traffic, database connections, and storage of the target server. These measures can in turn help you promptly detect a high workload on the target server that causes high resource utilization and a large number of failed connections. In addition, this test alerts administrators to low data read and write rates. Such metrics are a clear indication of read and write latency, which can in turn slow down request processing. Hence, using this test, administrators can identify CPU, memory, and I/O resource contention, as well as any latency issues or connection failures, in the PostgreSQL single server.

Target of the Test: A PostgreSQL database server on Azure

Agent deploying the test: A remote agent

Output of the test: One set of results for the target server being monitored.

Configurable parameters for the test
Parameters	Description
Test Period	How often should the test be executed.
Host	The host for which the test is to be configured.
Port	The port on which the server is listening.
Subscription ID	Specify the GUID that uniquely identifies the Microsoft Azure Subscription to be monitored. To find the ID that maps to the target subscription, do the following: (1) Log in to the Microsoft Azure Portal. (2) When the portal opens, click on the Subscriptions option (Figure 1: Clicking on the Subscriptions option). (3) The page that appears next (Figure 2: Determining the Subscription ID) lists all the subscriptions that have been configured for the target Azure AD tenant. Locate the subscription that is being monitored in the list, and check the value displayed for that subscription in the Subscription ID column. (4) Copy the Subscription ID shown in Figure 2 to the text box corresponding to the SUBSCRIPTION ID parameter in the test configuration page.
Tenant ID	Specify the Directory ID of the Azure AD tenant to which the target subscription belongs. To know how to determine the Directory ID, refer to Configuring the eG Agent to Monitor a Microsoft Azure Subscription Using Azure ARM REST API.
Client ID, Client Password, and Confirm Password	To connect to the target subscription, the eG agent requires an Access token in the form of an Application ID and the client secret value. For this purpose, you should register a new application with the Azure AD tenant. To know how to create such an application and determine its Application ID and client secret, refer to Configuring the eG Agent to Monitor a Microsoft Azure Subscription Using Azure ARM REST API. Specify the Application ID of the created Application in the Client ID text box and the client secret value in the Client Password text box. Confirm the Client Password by retyping it in the Confirm Password text box.
Resource Groups Name	Specify the name of the resource group the target server belongs to in this text box.
DD Frequency	Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: (1) the eG manager license should allow the detailed diagnosis capability; (2) both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
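
The Tenant ID, Client ID, and Client Password parameters above correspond to the inputs of an Azure AD OAuth 2.0 client-credentials token request, and the Subscription ID and Resource Groups Name identify the server as an Azure resource. How the eG agent uses these values internally is not described here; the following is only a minimal sketch of how such a request and resource ID could be assembled. The function names are hypothetical, and the request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request.

    Assumption: the standard Azure AD v2.0 token endpoint and the
    Azure Resource Manager scope are used.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,          # Application ID of the registered app
        "client_secret": client_secret,  # client secret value
        "scope": "https://management.azure.com/.default",
    }).encode("ascii")
    return Request(url, data=body, method="POST")


def server_resource_id(subscription_id: str, resource_group: str, server_name: str) -> str:
    """Return the ARM resource ID of a PostgreSQL single server."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DBforPostgreSQL/servers/{server_name}"
    )
```

Keeping request construction separate from sending makes it easy to inspect the URL and form body before any credentials leave the machine.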

Measures made by the test:

Measurement

Description

Measurement Unit

Interpretation

Server status

Indicates the current status of the target database server.

The values reported by this measure and their numeric equivalents are listed in the table below:

Numeric Value	Measure Value
1	Ready
2	Disabled
3	Dropping
4	Inaccessible
0	Unknown

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the target database server.

The detailed diagnosis of this measure lists the location of the database server, Backup retention days, Geo redundant backup, Storage Autogrow, SSL Enforcement, Public network access, SKU name, SKU Tier, SKU Family, and SKU Capacity.
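
If you post-process the values of the Server status measure in a script, the numeric-to-status mapping from the table above can be expressed as a simple lookup. This is an illustrative sketch; the names `SERVER_STATUS` and `status_label` are hypothetical, and treating unlisted values as Unknown is an assumption.

```python
# Numeric values and their Measure Values, as listed in the table above.
SERVER_STATUS = {
    0: "Unknown",
    1: "Ready",
    2: "Disabled",
    3: "Dropping",
    4: "Inaccessible",
}


def status_label(value: int) -> str:
    """Translate a numeric Server status value into its Measure Value.

    Assumption: any value not in the table is reported as "Unknown".
    """
    return SERVER_STATUS.get(value, "Unknown")
```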

Total databases

Indicates the total number of databases on the target server.

Number

The detailed diagnosis of this measure shows the database name.

Active connections

Indicates the number of active connections on the target server.

Number

Failed connections

Indicates the number of failed connections on the target server.

Number

Failed connections can occur for the following reasons:

Firewall settings
Connection time-out
Incorrect login information
Maximum limit reached on some Azure Database for PostgreSQL resources
Issues with the infrastructure of the service
Maintenance being performed in the service
The compute allocation of the server is changed by scaling the number of vCores or moving to a different service tier

CPU utilization

Indicates the percentage of CPU utilized to process all the tasks on the target server.

Percent

This measure sheds light on the workload of your Azure Database for PostgreSQL server and the Azure PostgreSQL process. A high CPU percentage indicates that the database server has more workload than it can handle.

Memory utilization

Indicates the percentage of memory utilized by the target server.

Percent

This measure indicates the memory utilization from both database workload and other Azure PostgreSQL processes.

Network traffic out

Indicates the total amount of outgoing network traffic on the target server.

This metric includes traffic from your database and from Azure PostgreSQL features such as monitoring and logs.

Network traffic in

Indicates the total amount of incoming network traffic on the target server.

This metric includes traffic to your database and to Azure PostgreSQL features such as monitoring and logs.

Backup storage used

Indicates the amount of backup storage used by this server.

This measure indicates the sum of storage that's consumed by all the full backups, differential backups, and log backups that are retained based on the backup retention period that's set for the server. The frequency of the backups is service managed. For geo-redundant storage, backup storage usage is twice the usage for locally redundant storage.
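
The arithmetic described above can be sketched as follows. This is an illustration only: the function name and its parameters are hypothetical, all sizes are assumed to be in the same unit, and the doubling for geo-redundant storage follows the statement above.

```python
def backup_storage_used(full, differential, log, geo_redundant=False):
    """Sum the sizes of all retained backups.

    full, differential, log: iterables of backup sizes (same unit).
    geo_redundant: if True, usage is twice the locally redundant usage.
    """
    total = sum(full) + sum(differential) + sum(log)
    return total * 2 if geo_redundant else total
```

For example, one full backup of 100 units, two differential backups of 20 and 30 units, and one log backup of 5 units add up to 155 units on locally redundant storage and 310 units on geo-redundant storage.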

Total storage

Indicates the maximum amount of storage available for this server.

Storage free

Indicates the amount of storage free on the target server.

A high value is desired for this measure.

Storage utilization

Indicates the percentage of storage utilized by the target server.

Percent

The storage that's used by the service can include database files, transaction logs, and server logs.

A value close to 100 percent for the Storage utilization measure indicates a storage bottleneck.

If the value of the Storage used measure is close to the value of the Total storage measure, the server is running out of storage space.
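
The relationship between the Total storage, Storage used, Storage free, and Storage utilization measures can be sketched as below. The formulas are the usual ones and are an assumption here, since this page does not state how the test derives each value; the function names and the 90 percent default threshold are hypothetical.

```python
def storage_summary(used: float, total: float):
    """Return (free, utilization_percent) for the given used and total storage."""
    if total <= 0:
        raise ValueError("total storage must be positive")
    free = total - used
    utilization = used * 100.0 / total
    return free, utilization


def is_near_capacity(used: float, total: float, threshold_pct: float = 90.0) -> bool:
    """Flag a server whose storage utilization meets or exceeds the threshold.

    threshold_pct is an illustrative default, not a documented limit.
    """
    _, utilization = storage_summary(used, total)
    return utilization >= threshold_pct
```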

Storage used

Indicates the amount of storage currently in use on the target server.

IO utilization

Indicates the percentage of I/O utilized by the target server.

Percent

I/O utilization takes both read and write IOPS into consideration.

Server log storage utilization

Indicates the percentage of server log storage used out of the server's maximum server log storage.

Percent

A high value for this measure indicates a server log storage bottleneck.

Server log storage used

Indicates the amount of server log storage used by the target server.

If the value of this measure is close to the maximum server log storage limit, the server is running out of server log storage space.

Server log storage limit

Indicates the maximum amount of server log storage available for the target server.

Replica lag time

Indicates the number of seconds the replica is behind in replaying the transactions received from the source server.

Seconds

This measure is available for replica servers only.

Max lag across replicas

Indicates the lag in bytes between the primary and the most-lagging replica.

This measure is available for the primary server only.
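
To make the idea of a byte lag to the most-lagging replica concrete, the sketch below computes it from write-ahead log positions. It assumes you already know the primary's current LSN and each replica's replayed LSN in PostgreSQL's textual `hi/lo` hexadecimal format; how the service itself derives this measure is not described here, and the function names are hypothetical.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000000' to a byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)


def max_lag_across_replicas(primary_lsn: str, replica_lsns: dict) -> int:
    """Return the largest byte lag between the primary and any replica.

    replica_lsns maps a replica name to its replayed LSN.
    Returns 0 when there are no replicas.
    """
    primary = lsn_to_bytes(primary_lsn)
    return max(
        (primary - lsn_to_bytes(lsn) for lsn in replica_lsns.values()),
        default=0,
    )
```

With the primary at `0/3000000` and replicas at `0/2000000` and `0/2FFFF00`, the lags are 16777216 and 256 bytes, so the function returns 16777216.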