SQL AlwaysOn Recovery Point Test

Recovery Point Objective (RPO) is the amount of data loss that is acceptable, or the point in time up to which data can be recovered. When a failover occurs, administrators want the secondary database to take over quickly from the primary database. If a large quantity of data has not yet been transferred from the primary database to the secondary database, users have to wait longer to access the databases during failover. A small amount of data loss is also common while a failover is in progress, typically because of the time lag in synchronization between the primary and secondary databases. If synchronization takes too long to complete, users are compelled to wait for a prolonged period to access the databases. To avoid such scenarios, it is essential to monitor the recovery point objective of the Microsoft SQL server. The SQL AlwaysOn Recovery Point test helps administrators in this regard.

This test reports the amount of log that has not been synchronized with the secondary database and the amount of hardened log that is yet to be applied to the secondary database. In addition, the test helps administrators analyze how long log records waited in the redo queue before being rolled forward to the secondary database. This way, administrators can be proactively alerted and can fine-tune the time taken to roll the log forward, so that synchronization completes quickly and without disruption.
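For readers who want to see where figures of this kind can come from, the sketch below reads the send queue and redo queue columns from SQL Server's `sys.dm_hadr_database_replica_states` dynamic management view through any Python DB-API cursor. It is an illustration only: this document does not state how the test itself collects its data, and the function name `fetch_replica_queues` is hypothetical.

```python
from typing import Any, Dict, List

# Columns of sys.dm_hadr_database_replica_states.
# Queue sizes are reported in KB and rates in KB/sec.
REPLICA_QUEUE_QUERY = """
SELECT DB_NAME(database_id) AS database_name,
       log_send_queue_size,
       log_send_rate,
       redo_queue_size,
       redo_rate
FROM sys.dm_hadr_database_replica_states
"""


def fetch_replica_queues(cursor: Any) -> List[Dict[str, Any]]:
    """Run the query on a DB-API cursor and return one dict per result row.

    Hypothetical helper: the cursor can come from any DB-API driver that is
    connected to the SQL Server instance hosting the availability group.
    """
    cursor.execute(REPLICA_QUEUE_QUERY)
    # cursor.description holds one sequence per column; item 0 is the name.
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]
```

The helper takes a cursor rather than opening a connection itself, so it stays independent of whichever driver and authentication mode are in use.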

This test is disabled by default. To enable the test, go to the ENABLE / DISABLE TESTS page using the menu sequence: Agents -> Tests -> Enable/Disable. Pick Microsoft SQL as the Component type, set Performance as the Test type, choose the test from the DISABLED TESTS list, and click the < button to move the test to the ENABLED TESTS list. Finally, click the Update button.

Target of the test : A Microsoft SQL server

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each database on the Microsoft SQL server monitored

Configurable parameters for the test
TEST PERIOD - How often the test should be executed.
HOST - The IP address of the Microsoft SQL server.
PORT - The port number through which the Microsoft SQL server communicates. The default port is 1433.
SSL - If the Microsoft SQL server being monitored is an SSL-enabled server, set the SSL flag to Yes. If not, set the SSL flag to No.
INSTANCE - The name of the specific Microsoft SQL instance to be monitored. The default value of this parameter is “default”. To monitor a Microsoft SQL instance named “CFS”, enter this as the value of the INSTANCE parameter.
USER - If a Microsoft SQL Server 7.0/2000 is monitored, provide the name of a SQL user with the Sysadmin role in this text box. While monitoring a Microsoft SQL Server 2005 or above, provide the name of a SQL user with all of the privileges outlined in User Privileges Required for Monitoring Microsoft SQL server.
PASSWORD - The password of the specified user.
CONFIRM PASSWORD - Confirm the password by retyping it.
DOMAIN - By default, none is displayed in the DOMAIN text box. If ‘SQL server and Windows’ authentication has been enabled for the server being monitored, the DOMAIN can remain none. If ‘Windows only’ authentication has been enabled, specify in the DOMAIN text box the Windows domain in which the managed Microsoft SQL server exists. In that case, the USER name and PASSWORD that you provide should be those of a user authorized to access the monitored SQL server.
ISNTLMV2 - In some Windows networks, NTLM (NT LAN Manager) may be enabled. NTLM is a suite of Microsoft security protocols that provides authentication, integrity, and confidentiality to users. NTLM version 2 (“NTLMv2”) was introduced to address the security issues present in NTLM. By default, the ISNTLMV2 flag is set to No, indicating that NTLMv2 is not enabled by default on the target Microsoft SQL host. Set this flag to Yes if NTLMv2 is enabled on the target host.
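The parameter rules above can be summarized as a small sketch. It models the parameters as a Python dataclass with the defaults stated in this section (port 1433, instance “default”, domain none, NTLMv2 flag No) and checks the one dependency the text describes: a domain is needed when ‘Windows only’ authentication is in use. The class name `SqlTestParams`, the `validate` method, and the `windows_only_auth` argument are hypothetical, and the default for the SSL flag is an assumption because this section does not give one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SqlTestParams:
    """Hypothetical container for the test's connection parameters."""
    host: str
    user: str
    password: str
    port: int = 1433            # default port stated in the parameter list
    ssl: bool = False           # assumption: no default is given in the text
    instance: str = "default"   # default instance name stated in the text
    domain: str = "none"        # "none" is displayed by default
    isntlmv2: bool = False      # No by default

    def validate(self, windows_only_auth: bool = False) -> List[str]:
        """Return a list of problems; an empty list means the values look usable."""
        problems = []
        if not self.host:
            problems.append("host must not be empty")
        if not 1 <= self.port <= 65535:
            problems.append("port must be between 1 and 65535")
        # 'Windows only' authentication requires a real Windows domain.
        if windows_only_auth and self.domain.strip().lower() == "none":
            problems.append("domain is required for 'Windows only' authentication")
        return problems


params = SqlTestParams(host="192.168.10.5", user="monitor", password="secret")
print(params.validate())                        # []
print(params.validate(windows_only_auth=True))  # one problem: domain is required
```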

Measurements made by the test
Measurement	Description	Measurement Unit	Interpretation
Log bytes flushed:	Indicates the rate at which log bytes were flushed to the secondary database to complete synchronization since the last recovery point.	Flushes/sec	If the value of this measure is consistently increasing, then it indicates that the potential data loss can increase indefinitely.
Log send queue size:	Indicates the amount of log that had not been sent to the secondary database from this database to complete synchronization.	KB	Ideally, the value of this measure should be zero. A high value indicates that this much data would be unavailable on the secondary database at failover; users would lose an amount of data equal to this measure.
Redo queue size:	Indicates the total number of kilobytes of hardened log that currently remain to be applied to the secondary database to roll it forward.	KB	A low value is desired for this measure.
Redo rate:	Indicates the rate at which log records were rolled forward on the secondary database from this database.	KB/sec
Pending logs recovery time:	Indicates the time duration for which the log records were waiting in the redo queue until being rolled forward to the secondary database.	Secs	Ideally, the value of this measure should be low.
Pending logs flushed time:	Indicates the time duration for which the logs were in the send queue until being flushed completely to the secondary database.	Secs
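To make the relationship between the queue sizes, the rates, and the two time measures concrete, the sketch below divides a queue size by the matching rate to estimate how long the queue would take to drain. This is a common approximation and not necessarily the formula the test uses, which this document does not specify. The function name `estimated_seconds` and the sample values are hypothetical.

```python
from typing import Optional


def estimated_seconds(queue_kb: float, rate_kb_per_sec: float) -> Optional[float]:
    """Estimate how long a queue takes to drain at the current rate.

    Returns 0.0 for an empty queue, and None when the queue is non-empty but
    the rate is zero, because no finite estimate exists in that case.
    """
    if queue_kb < 0 or rate_kb_per_sec < 0:
        raise ValueError("queue size and rate must be non-negative")
    if queue_kb == 0:
        return 0.0
    if rate_kb_per_sec == 0:
        return None
    return queue_kb / rate_kb_per_sec


# Hypothetical sample: 2048 KB waiting in the redo queue, redone at 512 KB/sec.
print(estimated_seconds(2048.0, 512.0))  # 4.0
# A non-empty send queue with a zero send rate cannot be estimated.
print(estimated_seconds(1024.0, 0.0))    # None
```

Returning None for a stalled queue, rather than a very large number, keeps a non-draining queue distinguishable from one that is merely slow.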