Mongo Thread Statistics Test

Threads are the workhorses of a MongoDB server. Every job that the server does for an application (reading pages, writing pages, eviction, and so on) is performed by application threads. Administrators need to keep an eye on the usage of these threads, as abnormal usage often signals an overload condition or potential contention on the server. To monitor thread usage and proactively detect such problem conditions, administrators can use the Mongo Thread Statistics test.

This test tracks the usage of application threads and reports the count of threads used for various activities. This way, the test points to the activities in which the maximum number of threads are actively engaged: is it fsync, reading, or writing? In the event of thread contention, these analytics help administrators figure out where most threads are being spent. Additionally, the test reveals how much time the threads take to perform cache eviction and how much time the cache waits for a thread to become available. If cache requests are not serviced quickly, these metrics will tell administrators why: is it because enough threads are not available to the cache?

Target of the test : A MongoDB server

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for the MongoDB server monitored.

Configurable parameters for the test
Parameter Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port number at which the specified host listens.

Database Name

The test connects to a specific Mongo database to run API commands and pull metrics of interest. Specify the name of this database here. The default value of this parameter is admin.

Username and Password

If the target MongoDB instance is access control enabled, the eG agent has to be configured with the credentials of a user who has the privileges required to monitor it. To know how to create such a user, refer to How to monitor access control enabled MongoDB database? If the target MongoDB instance is not access control enabled, specify none against the Username and Password parameters.

Confirm Password

Confirm the password by retyping it here.

Authentication Mechanism

MongoDB supports multiple authentication mechanisms that users can use to verify their identity. In environments where multiple authentication mechanisms are used, select the mechanism of interest from this list box. By default, this is set to None. You can modify this setting as required.

SSL

By default, the SSL flag is set to No, indicating that the target MongoDB server is not SSL-enabled. To enable the test to connect to an SSL-enabled MongoDB server, set the SSL flag to Yes.

CA File

A certificate authority (CA) file contains root and intermediate certificates that are electronically signed to affirm that a public key belongs to the owner named in the certificate. If you are looking to monitor the certificates contained within a CA file, then provide the full path to this file in the CA File text box. For example, the location of this file may be: C:\cert\rootCA.pem. If you do not want to monitor the certificates in a CA file, set this parameter to none.

Certificate Key File

The Certificate Key File parameter specifies the path on the server where your private key is stored. If you are looking to monitor the Certificate Key File, then provide the full path to this file in the Certificate Key File text box. For example, the location of this file may be: C:\cert\mongodb.pem. If you do not want to monitor the Certificate Key File, set this parameter to none.
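
The connection parameters above correspond loosely to the pieces of a standard MongoDB connection string. The following Python sketch shows one way they might fit together. It is illustrative only: the function name build_mongo_uri and the sample host and user are hypothetical, and mapping the SSL, CA File, and Certificate Key File parameters to the tls, tlsCAFile, and tlsCertificateKeyFile URI options is an assumption, not a description of how the eG agent actually connects.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(host, port, database="admin", username="none",
                    password="none", auth_mechanism="None", ssl=False,
                    ca_file="none", cert_key_file="none"):
    """Assemble a MongoDB connection string from test-style parameters.

    The value "none" (in any letter case) is treated as "not configured",
    mirroring the convention used by the parameters described above.
    """
    credentials = ""
    if username.lower() != "none":
        # Percent-encode credentials so characters such as '@' or ':' are safe.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"

    options = {}
    if auth_mechanism.lower() != "none":
        options["authMechanism"] = auth_mechanism
    if ssl:
        # Assumed mapping of the SSL-related parameters to URI options.
        options["tls"] = "true"
        if ca_file.lower() != "none":
            options["tlsCAFile"] = ca_file
        if cert_key_file.lower() != "none":
            options["tlsCertificateKeyFile"] = cert_key_file

    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{credentials}{host}:{port}/{database}{query}"


# Hypothetical host and user, for illustration only.
plain_uri = build_mongo_uri("db1.example.com", 27017)
auth_uri = build_mongo_uri("db1.example.com", 27017, username="monitor",
                           password="p@ss", auth_mechanism="SCRAM-SHA-256")
print(plain_uri)
print(auth_uri)
```

A string built this way could be handed to any MongoDB driver or shell to confirm that the host, port, credentials, and certificate paths are valid before entering them in the test configuration.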

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Active threads in fsync

Indicates the number of threads that were actively engaged in fsync operations during the last measurement period.

Number

As applications write data, MongoDB records the data in the storage layer and then writes the data to disk within the syncPeriodSecs interval, which is 60 seconds by default. Run fsync when you want to flush writes to disk ahead of that interval.

Active threads in read

Indicates the number of threads that were actively engaged in read operations during the last measurement period.

Number

If users of MongoDB complain of slowness, compare the values of the Active threads in read and Active threads in write measures with that of the Active threads in fsync measure to know which operation is hogging threads: fsync, reads, or writes. In the event of contention or slowness, these metrics will tell you which activities require more threads.

Active threads in write

Indicates the number of threads that were actively engaged in write operations during the last measurement period.

Number

Time taken for evicting threads

Indicates the time taken by threads to perform cache eviction.

Seconds

A high value is a cause for concern, as it implies that the threads are taking too long to evict objects from the cache and free space in it. This could be because adequate threads are not engaged in eviction. Where the WiredTiger storage engine is used, you may want to fine-tune one or more of these parameters to make sure eviction is smooth and quick:

  • The threads_min and threads_max parameters for the 'eviction' operation can be increased, so that more threads perform eviction, thereby reducing time taken to evict.

  • Increase the eviction_target, so that worker threads start evicting pages from the cache a lot later; until such time, worker threads will be available for other operations. This can ease thread contention.

  • Increase the eviction_trigger, so that application threads are not pulled into the eviction process too soon. This releases application threads, so they are available to perform other operations. This too can reduce thread contention.
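
The tuning options listed above can be expressed as a single WiredTiger configuration string. The Python sketch below builds such a string and wraps it in a command document for MongoDB's setParameter command, using the wiredTigerEngineRuntimeConfig parameter. The helper name build_eviction_config and all of the numeric values are hypothetical and chosen only for illustration; suitable values depend on your workload, and changes of this kind are best tried on a non-production server first.

```python
def build_eviction_config(threads_min, threads_max,
                          eviction_target, eviction_trigger):
    """Return a WiredTiger configuration string for the eviction settings.

    The checks below are basic sanity checks only; the storage engine
    applies its own validation when the string is submitted.
    """
    if not 1 <= threads_min <= threads_max:
        raise ValueError(
            "threads_min must be at least 1 and no greater than threads_max")
    if eviction_target >= eviction_trigger:
        raise ValueError(
            "eviction_target must be lower than eviction_trigger")
    return (
        f"eviction=(threads_min={threads_min},threads_max={threads_max}),"
        f"eviction_target={eviction_target},"
        f"eviction_trigger={eviction_trigger}"
    )


# Illustrative values only, not recommendations.
config = build_eviction_config(threads_min=4, threads_max=8,
                               eviction_target=85, eviction_trigger=97)

# Command document that a driver could send to the admin database.
command = {"setParameter": 1, "wiredTigerEngineRuntimeConfig": config}
print(command)
```

Keeping the validation in a helper makes it harder to submit an inconsistent pair of values, such as an eviction_target that is higher than the eviction_trigger.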

Waiting time for cache

Indicates the time the cache kept waiting for a thread to become available, so that requests to the cache can be serviced.

Seconds

A high value for this measure is a cause for concern, as it implies that there are not enough free threads to service cache requests. In such a situation, you may want to compare the values of the Active threads in fsync, Active threads in read, and Active threads in write measures to know where most threads are stuck. If these measures do not report abnormal values, then check the value of the Time taken for evicting threads measure. If this measure reports an abnormally high value, then cache eviction could be the bottleneck.
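
The troubleshooting sequence described above can be sketched as a small decision routine. The Python example below is a minimal sketch under stated assumptions: it reads the counters from a document shaped like the wiredTiger section of MongoDB's serverStatus output, and the key names used, the threshold values, and the function names extract_thread_stats and triage are all assumptions made for illustration. The source does not state which counters the test actually reads, and key names can vary across MongoDB versions.

```python
USECS_PER_SEC = 1_000_000


def extract_thread_stats(server_status):
    """Pull thread counters out of a serverStatus-like dictionary.

    The key names are assumed; missing keys default to 0.
    """
    wt = server_status.get("wiredTiger", {})
    state = wt.get("thread-state", {})
    yields = wt.get("thread-yield", {})
    return {
        "fsync": state.get("active filesystem fsync calls", 0),
        "read": state.get("active filesystem read calls", 0),
        "write": state.get("active filesystem write calls", 0),
        "evict_secs": yields.get(
            "application thread time evicting (usecs)", 0) / USECS_PER_SEC,
        "cache_wait_secs": yields.get(
            "application thread time waiting for cache (usecs)", 0)
        / USECS_PER_SEC,
    }


def triage(stats, cache_wait_limit=1.0, eviction_limit=1.0,
           busy_threads_limit=16):
    """Apply the checks in the order described in the text.

    All three limits are hypothetical; tune them for your environment.
    """
    if stats["cache_wait_secs"] <= cache_wait_limit:
        return "ok"
    # Cache wait is high: find where most threads are engaged.
    counts = {name: stats[name] for name in ("fsync", "read", "write")}
    busiest = max(counts, key=counts.get)
    if counts[busiest] >= busy_threads_limit:
        return f"threads busy in {busiest}"
    # Thread counts look normal: check eviction time next.
    if stats["evict_secs"] > eviction_limit:
        return "cache eviction is the likely bottleneck"
    return "inconclusive"


# Made-up sample document, for illustration only.
sample = {
    "wiredTiger": {
        "thread-state": {
            "active filesystem fsync calls": 0,
            "active filesystem read calls": 2,
            "active filesystem write calls": 1,
        },
        "thread-yield": {
            "application thread time evicting (usecs)": 4_500_000,
            "application thread time waiting for cache (usecs)": 3_000_000,
        },
    }
}
stats = extract_thread_stats(sample)
verdict = triage(stats)
print(stats)
print(verdict)
```

Because counters of this kind are typically cumulative, a real check would compare two samples taken one measurement period apart rather than judging a single reading.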