Mail Task Queue Test

Email messages waiting to be sent are held in a mail queue, which Confluence flushes once a minute. A Confluence administrator can also flush the mail queue manually. If a message cannot be sent, it is moved to an error queue, from which you can either resend it or delete it.

If the number of emails in the queue keeps increasing with time, it could mean that mail delivery has failed or is taking too long. Similarly, if emails are not flushed out of the error queue quickly, it hints at issues with mail delivery or mail system configuration. To ensure that the mail system functions smoothly and efficiently, administrators should be able to detect such anomalies quickly and act on them promptly. For this purpose, it is good practice for administrators to run the Mail Task Queue test periodically.

This test tracks changes to the mail queue and error queue, and proactively alerts administrators if the size of these queues keeps growing consistently. This points administrators to mail delivery failures or slowness in delivery of mails to recipients. The test also alerts administrators if too many mails from the mail queue failed to be delivered and were hence placed in the error queue. This way, the test sheds light on issues with the Confluence mailing system, thereby enabling administrators to promptly resolve them.

Target of the test: Atlassian Confluence

Agent deploying the test: An internal/remote agent

Outputs of the test: One set of results for the target Confluence server

Configurable parameters for the test
Parameter Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port number at which the specified host listens.

JMX remote port

Here, specify the port at which JMX listens for requests from remote hosts. Ensure that you specify the same port that you configured in the catalina.bat file in the <<ATLASSIAN_CONFLUENCE_INSTALL_DIR>>\confluence\bin directory.

JNDIname

The JNDIname is a lookup name for connecting to the JMX connector. By default, this is jmxrmi. If you have registered the JMX connector in the RMI registry using a different lookup name, then you can change this default value to reflect the same.

JMX Registry SSL

If you have registered the JMX connector in an SSL-enabled RMI registry, set this flag to Yes. By default, this is set to No.

User, Password, and Confirm password

If JMX requires authentication only (but no security), configure the User and Password parameters with the credentials of a user who has read-write access to JMX. Confirm the password by retyping it in the Confirm Password text box.

Timeout

Specify the duration (in seconds) for which this test should wait for a response from the target server. If there is no response from the target within the configured duration, the test will time out. By default, this is set to 10 seconds.
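The Host, JMX remote port, and JNDIname parameters together identify the JMX connector that the test contacts. As a minimal sketch, assuming the connector is exposed over the standard RMI-based JMX service URL, the address can be assembled as shown below. The function name is hypothetical and is not part of the product; it only illustrates how the three parameters relate to one another.

```python
def build_jmx_service_url(host: str, jmx_port: int, jndi_name: str = "jmxrmi") -> str:
    """Assemble the standard RMI-based JMX service URL.

    host      -> the Host parameter
    jmx_port  -> the JMX remote port parameter
    jndi_name -> the JNDIname parameter (jmxrmi by default)
    """
    if not 0 < jmx_port < 65536:
        raise ValueError(f"invalid JMX port: {jmx_port}")
    return f"service:jmx:rmi:///jndi/rmi://{host}:{jmx_port}/{jndi_name}"
```

For example, with a hypothetical host confluence.example.com and port 9010, the function returns service:jmx:rmi:///jndi/rmi://confluence.example.com:9010/jmxrmi.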

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Task queue size

Indicates the number of email messages queued for dispatch.

Number

If the value of this measure keeps increasing with time, it indicates that automatic flushing of emails from the mail queue is not working. You may want to manually flush the messages from the mail queue in this case.
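What counts as "keeps increasing with time" depends on your alerting policy. The sketch below is one possible interpretation, not the test's actual logic: it treats the queue as consistently growing when the most recent samples never decrease and end higher than they started. The function name and the default window of 5 samples are assumptions chosen for illustration.

```python
from typing import Sequence


def is_growing_consistently(samples: Sequence[int], min_points: int = 5) -> bool:
    """Return True if the last `min_points` queue-size samples never
    decrease and the newest sample is larger than the oldest one."""
    if len(samples) < min_points:
        return False  # not enough history to judge a trend
    window = samples[-min_points:]
    non_decreasing = all(later >= earlier for earlier, later in zip(window, window[1:]))
    return non_decreasing and window[-1] > window[0]
```

A queue that is flushed normally tends to drop back toward 0 between samples, so a window such as 3, 5, 0, 2, 1 is not flagged, while 3, 5, 8, 12, 20 is.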

Another reason for a long mail queue could be delay/slowness in delivery of emails to intended recipients. There could be multiple causes for this to happen:

  • The Mail Queue Service is scheduled to run once per minute by default, but a user can change this frequency. Any such change directly affects how long an email takes to reach its recipient. To resolve this issue, reset the frequency of the Mail Queue Service to its default or to some other acceptable frequency.
  • A message that cannot be flushed may be stuck in the queue; Confluence attempts to flush it again and again, which delays the other mail items. In this case, attempt a manual flush of the mail queue to identify the exact message that fails to be flushed. Then, reach out to https://support.atlassian.com for further assistance.
  • By default, Confluence services run on either 2 or 4 QuartzWorker threads, depending on the version of Confluence. If a thread is blocked or delayed, the next service may not run even though it is scheduled. This means that if other services are scheduled at the same frequency as the mail service, the mail service can be delayed whenever the services scheduled to run before it are delayed. Setting appropriate operational frequencies for the different services will resolve this problem.
  • Verify that you can reach the mail server with ping, and check whether latency is high. High latency prevents the JavaMail service used by Confluence from receiving responses in a timely manner. Delays can also occur if the SMTP mail server is overloaded and is therefore unable to process mails quickly. Consult the network administrator or mail server administrator to troubleshoot this further.
  • Every time Confluence attempts to send mail, it performs a reverse DNS lookup for the Confluence server hostname. If the DNS server is not reachable, Confluence has to wait for a timeout, which can be long (20-40 seconds). Make sure your DNS configuration is correct to avoid this issue.
  • Subscriptions may be going out to very large user groups, or the filters that are subscribed to may be running slow queries. Check the filtersubscription table and slow searchrequest table for such anomalous subscriptions/queries.
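Two of the causes above, slow reverse DNS lookups and a slow or unreachable SMTP server, can be checked from the Confluence host with a short script. The following is a rough sketch using only the Python standard library; the helper names are hypothetical, and the default SMTP port of 25 is an assumption that may not match your mail server.

```python
import socket
import time
from typing import Callable, Optional, Tuple


def timed_call(fn: Callable[[], object]) -> Tuple[float, Optional[Exception]]:
    """Run fn and return (elapsed seconds, network error or None)."""
    start = time.perf_counter()
    try:
        fn()
        return time.perf_counter() - start, None
    except OSError as exc:  # covers DNS failures, refusals, and timeouts
        return time.perf_counter() - start, exc


def check_reverse_dns(ip_address: str) -> Tuple[float, Optional[Exception]]:
    """Time a reverse DNS lookup for the given IP address."""
    return timed_call(lambda: socket.gethostbyaddr(ip_address))


def check_smtp_connect(host: str, port: int = 25,
                       timeout: float = 10.0) -> Tuple[float, Optional[Exception]]:
    """Time a plain TCP connection to the SMTP server (no mail is sent)."""
    def connect() -> None:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    return timed_call(connect)
```

Calling check_reverse_dns with the Confluence server's IP address and check_smtp_connect with the mail server's hostname returns the elapsed time of each step. An elapsed time of many seconds, or a returned error, points to the DNS or mail server issues described above.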

Error queue size

Indicates the number of error mail messages in the error queue.

Number

Ideally, the value of this measure should be 0. A non-zero value implies that one or more mails in the mail queue could not be delivered and have hence been placed in the error queue. If the value of this measure keeps increasing with time, it indicates that emails in the error queue could not be delivered even after multiple retries. This is a cause for concern and must be looked into immediately.

Retry count

Indicates the number of times the delivery of emails in the error queue was retried.

Number

Ideally, the value of this measure should be low.

Is currently flushing

Indicates whether or not the mail queue is currently flushing.

The values that this measure can report and their corresponding numeric values are listed in the table below:

Measure Value

Yes

No

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether or not mail queue flushing is in progress. In the graph of this measure, however, the same is indicated using the numeric equivalents.

Error percentage

Indicates what percentage of mails in the mail queue have been placed in the error queue.

Percent

Ideally, the value of this measure should be very low. A value close to 100% is a cause for concern, as it indicates that almost all of the messages in the mail queue could not be delivered, and were hence placed in the error queue. The reason for the delivery failures will have to be investigated and promptly fixed. You may want to do the following in this case:

  • Ensure that you have properly configured an SMTP Server. Send a Test Mail inside the SMTP Server configuration setup screen. Make a note of any error that is returned from the test.
  • Check your Confluence application log files and the application server log files for Out of Memory errors. Typically, the log file will show java.lang.OutOfMemoryError: Java heap space. This has been known to cause the service responsible for sending emails to fail until your applications are restarted. You should further troubleshoot your memory issues.
  • Check that the Mail Queue Service is installed. Click Administration > Services to verify that the service exists and is set to a reasonable interval. This interval controls how frequently the mail queue is processed. You can flush the mail queue to send out pending messages immediately to your mail server.
  • Inspect your Mail Queue under Administration > Mail Queue. See if you are given the option to Bypass currently sending mail. A stuck email or trackback ping can hold up the queue.
  • Check that your Base URL is set to a domain / IP which your SMTP server will accept. Example: Google Apps accounts must have a Base URL that matches their Google Apps domain.
  • Enable additional logging in Administration > System > Troubleshooting and Support > Logging and Profiling by setting the following to DEBUG to see more robust logging about services running in the background.
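The Out of Memory check in the list above can be scripted. The sketch below scans log lines for the java.lang.OutOfMemoryError marker mentioned earlier and returns the matching line numbers. The function name is hypothetical, and the location of the log file depends on your installation, so no path is assumed here.

```python
from typing import Iterable, List, Tuple

OOM_MARKER = "java.lang.OutOfMemoryError"


def find_oom_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every log line that contains
    the OutOfMemoryError marker. Line numbers start at 1."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOM_MARKER in line
    ]
```

To use it, open the application log with open(path, errors="replace") and pass the file object to find_oom_lines. An empty result means that no Out of Memory error was logged in that file.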