SQL AlwaysOn Replica Status Test

An availability replica is an instantiation of an availability group that is hosted by a specific instance of SQL Server and maintains a local copy of each availability database that belongs to the availability group. There are two types of availability replicas that exist in the availability group: single primary replica and one to eight secondary replicas.

An availability group fails over at the level of an availability replica. An availability replica provides redundancy only at the database level—for the set of databases in one availability group. Failovers are not caused by database issues such as a database becoming suspect due to a loss of a data file or corruption of a transaction log. The primary replica makes the primary databases available for read-write connections from clients. Also, in a process known as data synchronization, which occurs at the database level. The primary replica sends transaction log records of each primary database to every secondary database. Every secondary replica caches the transaction log records (hardens the log) and then applies them to its corresponding secondary database.

Whenever a failover is detected, the administrators may want the secondary replica to take over quickly from the primary replica. If the primary replica and secondary replicas are not in a position to apply the transaction logs to the primary and secondary databases, then there may be too much of non-sync between the primary replica and the secondary replicas during failover. In order to minimize such synchronization problems and maintain the secondary replicas on par with the primary replica, administrators are required to continuously monitor the operational state, synchronization status and synchronization health of the availability replicas. The SQL AlwaysOn Replica Status test helps administrators in this regard! This test continuously monitors the operational state, synchronization state and synchronization health of each availability replica of the SQL AlwaysOn Availability groups. In addition, administrators would be alerted to the current recovery state of the availability replica after a failover is initiated.

This test is disabled by default. To enable the test, go to the enable / disable tests page using the menu sequence : Agents -> Tests -> Enable/Disable, pick Microsoft SQL as the desired Component type, set Performance as the Test type, choose the test from the disabled tests list, and click on the < button to move the test to the ENABLED TESTS list. Finally, click the Update button.

Target of the test : A Microsoft SQL server

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each availability replica on the Microsoft SQL server that is being monitored

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed
  2. Host – The IP address of the Microsoft SQL server.
  3. Port - The port number through which the Microsoft SQL server communicates. The default port is 1433.
  4. ssl – If the Microsoft SQL server being monitored is an SSL-enabled server, then set the ssl flag to Yes. If not, then set the ssl flag to No.
  5. instance - In this text box, enter the name of a specific Microsoft SQL instance that is to be monitored. The default value of this parameter is “default”. To monitor a Microsoft SQL instance named “CFS”, enter this as the value of the INSTANCE parameter.
  6. USER – If a Microsoft SQL Server 7.0/2000 is monitored, then provide the name of a SQL user with the Sysadmin role in this text box. While monitoring a Microsoft SQL Server 2005 or above, provide the name of a SQL user with all of the privileges outlined in User Privileges Required for Monitoring Microsoft SQL server.

  7. password - The password of the specified user
  8. confirm password - Confirm the password by retyping it.
  9. domain - By default, none is displayed in the DOMAIN text box. If the ‘SQL server and Windows’ authentication has been enabled for the server being monitored, then the DOMAIN can continue to be none. On the other hand, if ‘Windows only’ authentication has been enabled, then, in the DOMAIN text box, specify the Windows domain in which the managed Microsoft SQL server exists. Also, in such a case, the USER name and PASSWORD that you provide should be that of a user authorized to access the monitored SQL server.
  10. isntlmv2 - In some Windows networks, NTLM (NT LAN Manager) may be enabled. NTLM is a suite of Microsoft security protocols that provides authentication, integrity, and confidentiality to users. NTLM version 2 (“NTLMv2”) was concocted to address the security issues present in NTLM. By default, the isntlmv2 flag is set to No, indicating that NTLMv2 is not enabled by default on the target Microsoft SQL host. Set this flag to Yes if NTLMv2 is enabled on the target host.
  11. DETAILED DIAGNOSIS – To make diagnosis more efficient and accurate, the eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

    The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

    • The eG manager license should allow the detailed diagnosis capability
    • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Operational state:

Indicates the current operational state of this availability replica.

 

The values reported by this measure and their numeric equivalents are available in the table below:

Measure Value Numeric Value
FAILED_NO_QUORUM 0
FAILED 1
OFFLINE 2
PENDING_FAILOVER 3
PENDING 4
ONLINE 5

Note:

This measure reports the Measure Values listed in the table above to indicate the states of this availability replica. However, in the graph, this measure is indicated using the Numeric Values listed in the above table.

Recovery state:

Indicates the current recovery state of this availability replica.

 

The values reported by this measure and their numeric equivalents are available in the table below:

Measure Value Numeric Value
ONLINE_IN_PROGRESS 0
ONLINE 1

Note:

This measure reports the Measure Values listed in the table above to indicate the recovery states of this availability replica. However, in the graph, this measure is indicated using the Numeric Values listed in the above table.

Synchronization health state:

Indicates the health of this availability replica during synchronization.

 

The values reported by this measure and their numeric equivalents are available in the table below:

Measure Value

Numeric Value

NOT_HEALTHY 0
PARTIALLY_HEALTHY 1
HEALTHY 2

Note:

This measure reports the Measure Values listed in the table above to indicate the health status of the availability replica during synchronization. However, in the graph, this measure is indicated using the Numeric Values listed in the above table.

Connected state:

Indicates the connection state of this availability replica with the primary/secondary availability replica.

 

The values reported by this measure and their numeric equivalents are available in the table below:

Measure Value Numeric Value
DISCONNECTED 0
CONNECTED 1

Note:

This measure reports the Measure Values listed in the table above to indicate the connection state between the primary and secondary replicas. However, in the graph, this measure is indicated using the Numeric Values listed in the above table.

The Detailed diagnosis of this measure if enabled, lists the ReplicaID, IsLocal, Role, Last connect error number, Last connect error description, and the Last connect error time.