SQL AlwaysOn Network Latency Test

If transaction log records are not sent quickly by the primary database or are not applied quickly by the secondary database, then the data in the primary and secondary databases will be out of sync; this will cause significant data loss during a failover. To avoid this, administrators must keep track of the log record traffic between the primary and secondary databases, proactively detect potential slowness in synchronization, figure out the probable source of the bottleneck, and clear it to ensure proper synchronization between the primary and secondary databases. This is where the SQL AlwaysOn Network Latency test helps.

This test measures the rate at which transaction log data is sent to the secondary database for synchronization on each SQL server instance, and the time taken by the secondary database to apply the data. In the process, the test pinpoints bottlenecks in database synchronization and where exactly the bottlenecks lie.

This test is disabled by default. To enable the test, go to the enable / disable tests page using the menu sequence : Agents -> Tests -> Enable/Disable, pick Microsoft SQL as the desired Component type, set Performance as the Test type, choose the test from the disabled tests list, and click on the < button to move the test to the ENABLED TESTS list. Finally, click the Update button.

Target of the test : A Microsoft SQL server

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each database on the Microsoft SQL server instance being monitored

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed
  2. Host – The IP address of the Microsoft SQL server.
  3. Port - The port number through which the Microsoft SQL server communicates. The default port is 1433.
  4. ssl – If the Microsoft SQL server being monitored is an SSL-enabled server, then set the ssl flag to Yes. If not, then set the ssl flag to No.
  5. instance - In this text box, enter the name of a specific Microsoft SQL instance that is to be monitored. The default value of this parameter is “default”. To monitor a Microsoft SQL instance named “CFS”, enter this as the value of the INSTANCE parameter.
  6. USER – If a Microsoft SQL Server 7.0/2000 is monitored, then provide the name of a SQL user with the Sysadmin role in this text box. While monitoring a Microsoft SQL Server 2005 or above, provide the name of a SQL user with all of the privileges outlined in User Privileges Required for Monitoring Microsoft SQL server.

  7. password - The password of the specified user.
  8. confirm password - Confirm the password by retyping it.
  9. domain - By default, none is displayed in the DOMAIN text box. If the ‘SQL server and Windows’ authentication has been enabled for the server being monitored, then the DOMAIN can continue to be none. On the other hand, if ‘Windows only’ authentication has been enabled, then, in the DOMAIN text box, specify the Windows domain in which the managed Microsoft SQL server exists. Also, in such a case, the USER name and PASSWORD that you provide should be that of a user authorized to access the monitored SQL server.
  10. isntlmv2 - In some Windows networks, NTLM (NT LAN Manager) may be enabled. NTLM is a suite of Microsoft security protocols that provides authentication, integrity, and confidentiality to users. NTLM version 2 (“NTLMv2”) was concocted to address the security issues present in NTLM. By default, the isntlmv2 flag is set to No, indicating that NTLMv2 is not enabled by default on the target Microsoft SQL host. Set this flag to Yes if NTLMv2 is enabled on the target host.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Log sent:

Indicates the amount of data (in bytes) sent from the primary availability replica to the secondary availability replica per second during the last measurement period.

KB/sec

 

Log transport:

Indicates the amount of data (in bytes) sent over the network from the primary availability replica to the secondary availability replica per second during the last measurement period.

KB/sec

 

Log send wait time:

Indicates the time duration for which the log stream messages were waiting in the Flow Control mode per second.

Msecs/sec

Ideally, the value of this measure should be low. A gradual/sudden increase in this measure indicates that the network over which the log messages are sent is experiencing slowdowns/network delays and noise. A high value for this measure is also indicative of potential data loss which is much more than the estimated Recovery Point Objective (RPO).

Log send waits:

Indicates the number of times Flow Control mode was initiated per second.

Waits/sec

A high value for this measure indicates that the network is congested and is experiencing slowdowns.

Avg log send wait time:

Indicates the average time the log messages should wait in the Flow Control mode.

Secs/Wait

This measure is a ratio of the Log send wait time and the Log send waits measures.

A low value is desired for this measure.

Alwayson messages resent:

Indicates the number of Always on messages i.e., log stream messages that were resent over the network during the last measurement period.

Number

Ideally, the value of this measure should be low.

A high value for this measure is a cause of concern as this indicates a high network latency or network congestion or network noise.