Asynchronous Thread Queue Test

LSASS (Local Security Authority Subsystem Service), the Windows process responsible for enforcing the security policy on the system, adopted its threading library from IIS to handle Windows socket communication, and uses the asynchronous thread queue to handle requests from Kerberos and LDAP.

Monitoring the asynchronous thread queue (ATQ) on an AD server will provide useful pointers to the request processing ability of the server. This test monitors the ATQ, reports the number and nature of requests queued in the ATQ, captures a steady growth (if any) in the length of the queue over time, and thus reveals potential processing bottlenecks on the AD server.
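One way to capture the steady growth mentioned above is to fit a straight line through successive queue-length samples and look at its slope. The following is a minimal sketch under stated assumptions: the function names, the sample values, and the `min_slope` threshold are hypothetical and are not part of the test itself.

```python
from statistics import mean


def queue_growth_slope(samples):
    """Least-squares slope of queue length per sampling interval.

    `samples` is a list of queue lengths taken at equal intervals
    (hypothetical input; the test itself reports one value per run).
    """
    n = len(samples)
    if n < 2:
        return 0.0  # a single sample carries no trend information
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def is_steadily_growing(samples, min_slope=1.0):
    """True if the queue grows by at least `min_slope` requests per interval.

    The threshold of 1.0 is an illustrative assumption, not a recommendation.
    """
    return queue_growth_slope(samples) >= min_slope
```

A positive slope sustained across many measurement periods suggests that requests are arriving faster than they are serviced, whereas a single spike affects the slope only slightly.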

Note:

This test applies only to Active Directory Servers installed on Windows 2008.

Target of the test: An Active Directory server

Agent deploying the test: An internal agent

Outputs of the test: One set of results for every Active Directory server being monitored

Configurable parameters for the test
Parameters Description

Test period

Indicates how often the test should be executed.

Host

The IP address of the machine where Active Directory is installed.

Port

The port number through which Active Directory communicates. The default port number is 389.

Measurements made by the test
Measurement Description Measurement Unit Interpretation

ATQ estimated queue delay

Indicates how long a request has to wait in the queue.

Secs

This is the estimated time the next request will spend in the queue prior to being serviced by the directory service.

ATQ outstanding queued requests

Indicates the number of requests currently in the queue.

Number

A high level of queuing indicates that requests are arriving at the domain controller faster than they can be processed. This can also lead to a high latency in responding to requests.

ATQ request latency

Indicates the time it takes to process an enqueued request.

Secs

Since the types of requests can differ, the value of this measure is typically not significant on its own, as it is an average. An expensive LDAP query that takes minutes to execute can be masked by hundreds of fast LDAP queries or KDC requests.

The main use of this measure, therefore, is to monitor the wait time in the queue. Any non-zero value indicates that the DC has run out of threads.
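The masking effect described above is simple arithmetic, and can be illustrated with a short sketch. The latency figures here are invented purely for illustration (one 120-second query among 999 queries of 10 ms each), and the function name is hypothetical.

```python
def average_latency(latencies):
    """Arithmetic mean of request latencies, in seconds."""
    return sum(latencies) / len(latencies)


# Hypothetical workload: 999 fast requests and one expensive LDAP query.
fast_requests = [0.01] * 999   # 10 ms each
slow_requests = [120.0]        # a single two-minute query
samples = fast_requests + slow_requests

avg = average_latency(samples)  # roughly 0.13 seconds
worst = max(samples)            # the 120-second query is still in the data
```

With these assumed numbers the average stays near 0.13 seconds even though one request took two minutes, which is why the average alone says little about individual expensive queries.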

ATQ threads ldap

Indicates the number of threads used by the LDAP server as determined by LDAP policy.

Number

This measure indicates the number of threads currently servicing LDAP requests. If the value of this measure is unusually high, then look for the following:

  • Expensive or inefficient LDAP queries
  • Excessive numbers of LDAP queries
  • An insufficient number of DCs to service the workload (or existing DCs are undersized)
  • Memory, CPU, or disk bottlenecks on the DC

Large values for this measure are common, but the thread count should remain less than the value of the ATQ threads total measure.

Note that this measure can also report abnormally high values when a problem is initially triggered by LDAP but ultimately caused by external dependencies. Such cases are as follows:

  • If the IP address of LDAP pings from clients does not map to an AD site: In this case, the LDAP server performs an exhaustive address lookup to discover additional client IP addresses so that it may find a site to map to the client.
  • If the DC supports LDAP over SSL / TLS: In this case, a user sends a certificate on a session. The server needs to check for certificate revocation which may take some time. This becomes problematic if network communication is restricted and the DC cannot reach the Certificate Distribution Point (CDP) for a certificate.

ATQ threads other

Indicates the number of threads used by other components, in this case the KDC.

Number

External dependencies can also generate requests that hit the Kerberos Key Distribution Center (KDC). One common operation is getting the list of global and universal groups from a DC that is not a Global Catalog (GC). A second external and potentially intermittent root cause occurs when the Kerberos Forest Search Order (KFSO) feature has been enabled on Windows Server 2008 R2 and later KDCs to search trusted forests for SPNs that cannot be located in the local forest. The worst-case scenario occurs when the KDC searches both local and trusted forests for an SPN that can’t be found, either because the SPN does not exist or because the search focused on an incorrect SPN.

Memory dumps from in-state KDCs will reveal a number of threads working on Kerberos Service Ticket Requests along with pending RPC calls to remote domain controllers.

ATQ threads total

Indicates the total number of threads that are currently allocated.

Number

This measure tracks the total number of threads from the ATQ threads ldap and ATQ threads other measures. The maximum number of threads that a given DC can apply to incoming workloads can be found by multiplying MaxPoolThreads by the number of logical CPU cores. MaxPoolThreads defaults to a value of 4 in LDAP Policy and should not be modified without understanding the implications.
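The thread ceiling described above can be worked out as follows. This is a sketch only: the function name is hypothetical, and `os.cpu_count()` reports the logical cores of the machine running the script, which is assumed here to be the DC itself.

```python
import os


def max_atq_threads(max_pool_threads=4, logical_cores=None):
    """Upper bound on ATQ threads: MaxPoolThreads x logical CPU cores.

    `max_pool_threads` defaults to 4, the LDAP Policy default stated above.
    If `logical_cores` is not given, fall back to the local core count
    (an assumption that only holds when run on the DC being sized).
    """
    if logical_cores is None:
        logical_cores = os.cpu_count() or 1  # cpu_count() may return None
    return max_pool_threads * logical_cores
```

For example, `max_atq_threads(4, 8)` gives 32 for a DC with 8 logical cores and the default MaxPoolThreads value.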

Compare the value of this measure with the values of the ATQ threads ldap and ATQ threads other measures. If the ATQ threads ldap measure equals this measure in value, then all of the LDAP listen threads are currently busy processing LDAP requests. If the ATQ threads other measure equals this measure in value, then all of the LDAP listen threads are busy responding to Kerberos-related traffic.

Similarly, note how close the current value of this measure is to the maximum value recorded in the trace, and whether both values have reached the maximum number of threads supported by the DC being monitored. If so, it is a strong indication that the DC is overloaded.
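The comparisons above can be expressed as a small helper. This is an illustrative sketch rather than part of the test: the function name, argument names, and finding strings are hypothetical, and `max_threads` is assumed to be MaxPoolThreads multiplied by the number of logical CPU cores.

```python
def classify_atq_threads(ldap, other, total, max_threads):
    """Return a list of findings for one sample of the thread measures.

    ldap, other, total: values of ATQ threads ldap, ATQ threads other,
    and ATQ threads total. max_threads: the assumed ceiling for the DC.
    """
    findings = []
    if total > 0 and ldap == total:
        findings.append("all allocated threads are processing LDAP requests")
    if total > 0 and other == total:
        findings.append("all allocated threads are handling Kerberos-related traffic")
    if max_threads > 0 and total >= max_threads:
        findings.append("thread pool has reached the maximum supported by the DC")
    return findings
```

A sample such as `classify_atq_threads(32, 0, 32, 32)` returns two findings (LDAP saturation and a pool at its ceiling), while `classify_atq_threads(3, 2, 5, 32)` returns an empty list.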