SQL AlwaysOn Network Latency Test
If transaction log records are not sent quickly by the primary database, or are not applied quickly by the secondary database, the two databases fall out of sync, and a failover can then cause significant data loss. To avoid this, administrators must track the log record traffic between the primary and secondary databases, proactively detect slowness in synchronization, identify the probable source of the bottleneck, and clear it so that the primary and secondary databases stay synchronized. This is where the SQL AlwaysOn Network Latency test helps.
For each SQL server instance, this test measures the rate at which transaction log data is sent to the secondary database for synchronization, and the time taken by the secondary database to apply the data. In the process, the test reveals bottlenecks in database synchronization and pinpoints where they lie.
This test is disabled by default. To enable the test, go to the Enable/Disable tests page using the menu sequence: Agents -> Tests -> Enable/Disable. Pick Microsoft SQL as the Component type, set Performance as the Test type, choose the test from the DISABLED TESTS list, and click the < button to move the test to the ENABLED TESTS list. Finally, click the Update button.
Target of the test : A Microsoft SQL server
Agent deploying the test : An internal agent
Outputs of the test : One set of results for each database on the Microsoft SQL server instance being monitored
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Log sent | Indicates the amount of data sent from the primary availability replica to the secondary availability replica per second during the last measurement period. | KB/sec | |
Log transport | Indicates the amount of data sent over the network from the primary availability replica to the secondary availability replica per second during the last measurement period. | KB/sec | |
Log send wait time | Indicates the time, per second, for which log stream messages waited in Flow Control mode. | Msecs/sec | Ideally, the value of this measure should be low. A gradual or sudden increase in this measure indicates that the network over which the log messages are sent is experiencing slowdowns, delays, or noise. A high value also indicates that potential data loss may far exceed the estimated Recovery Point Objective (RPO). |
Log send waits | Indicates the number of times Flow Control mode was initiated per second. | Waits/sec | A high value for this measure indicates that the network is congested and is experiencing slowdowns. |
Avg log send wait time | Indicates the average time that log stream messages waited in Flow Control mode per wait. | Secs/Wait | This measure is the ratio of the Log send wait time measure to the Log send waits measure. A low value is desired for this measure. |
AlwaysOn messages resent | Indicates the number of AlwaysOn messages (i.e., log stream messages) that were resent over the network during the last measurement period. | Number | Ideally, the value of this measure should be low. A high value is a cause for concern, as it indicates high network latency, network congestion, or network noise. |
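
The Avg log send wait time measure is described above as the ratio of Log send wait time to Log send waits. The following is a minimal sketch of that arithmetic, not the test's actual implementation. The function name and the sample values are hypothetical, and the division by 1000 is an assumption made to reconcile the Msecs/sec unit of the numerator with the Secs/Wait unit reported for the result.

```python
from typing import Optional


def avg_log_send_wait_secs(wait_time_ms_per_sec: float,
                           waits_per_sec: float) -> Optional[float]:
    """Return the average time per Flow Control wait, in seconds.

    wait_time_ms_per_sec: value of the Log send wait time measure (Msecs/sec).
    waits_per_sec:        value of the Log send waits measure (Waits/sec).

    Returns None when no waits occurred, since the ratio is undefined.
    """
    if waits_per_sec <= 0:
        return None
    ms_per_wait = wait_time_ms_per_sec / waits_per_sec
    # Assumption: convert milliseconds to seconds to match the Secs/Wait unit.
    return ms_per_wait / 1000.0


# Hypothetical samples: (Log send wait time, Log send waits) per measurement period.
samples = [(120.0, 4.0), (0.0, 0.0), (900.0, 3.0)]
results = [avg_log_send_wait_secs(t, w) for t, w in samples]

for (t, w), avg in zip(samples, results):
    label = "no waits" if avg is None else f"{avg:.3f} secs/wait"
    print(f"wait time={t} ms/sec, waits={w}/sec -> {label}")
```

Reading the two source measures together with the ratio helps separate many short waits (a high Log send waits value with a low average) from a few long ones (a low Log send waits value with a high average).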