ESE Database Engine Statistics Test

If a user experiences slow downs when accessing his/her Exchange mailbox, nine out of 10 times the reason would be the underlying ESE database. Various factors affect database performance. An under-sized / poorly configured database cache, latency in I/O processing, delay in reading/replaying log files, a session overload, can all cause an ESE database’s performance to degrade. To ensure that user productivity is not impacted by poor database performance, administrators should closely monitor each aspect of the performance of every ESE database on the Exchange server, capture anomalies before users notice, and resolve them before they affect the user experience with his/her mailbox.  This is where the ESE Database Engine Statistics test helps. This test auto-discovers the ESE databases on the Mailbox server, and for each database so discovered, reports the following:

  • How effectively is the database using its cache?  
  • Is the cache sized right?
  • How quickly is the database processing read/write requests to it?
  • Are log files read and replayed rapidly?
  • Are too many sessions to the database idle? 

Using the insights provided by this test, administrators can accurately identify slow/overloaded databases, pinpoint why the database performance is bottlenecked, and clear the bottleneck well before users notice and complain.

Target of the test : A Microsoft Exchange 2013/2016 server

Agent deploying the test : An internal agent

Outputs of the test : One set of results for the each ESE database on the Exchange 2013/2016 server being monitored

Configurable parameters for the test
  1. Test period - How often should the test be executed
  2. Host - The host for which the test is to be configured.
  3. port - The port at which the host listens.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Pages in database cache being compressed:

Indicates the percentage of pages in the cache of this database that are currently compressed.

Percent

In order to conserve the memory resources of the Exchange Mailbox server, the ‘Back Pressure’ feature of the server automatically performs ‘message dehydration’ if the server’s memory usage crosses a configured threshold (by default, this is 94%). Message dehydration is the act of removing unnecessary elements of queued messages – i.e., compressing messages - that are cached in memory. Typically, complete messages are cached in memory for enhanced performance. Removal of the MIME content of queued messages from memory reduces the memory that is used.

A high value for this measure therefore indicates that messages in cache have been compressed to a large extent, thus freeing up a significant amount of system memory.  On the other hand, a high value for this measure can also increase the latency of mailbox operations, because, in the absence of complete messages in the cache, the server will be forced to read the messages directly from the message queue database.

To avoid this, the mailbox server has to be sized with adequate memory and the memory usage threshold of the Backprssure feature should also be set high enough.

Page requests fulfilled by database cache:

Indicates the percentage of page requests to this database that were fulfilled by the database cache without causing a file operation. 

Percent

If this percentage is too low, it could be because the database cache size may be too small. You may want to consider increasing the cache size, so as to reduce the processing overheads that may be incurred on account of direct database accesses. 

 

Database cache size:

Indicates the amount of system memory, in megabytes, used by the database cache manager to hold commonly used information from the database file(s) to prevent file operations.

MB

If the database cache size seems to be too small for optimal performance, and there is very little available memory on the system, an increase of memory in the system may increase performance. If there is a large amount of available memory on the system, and the database cache size is not growing beyond a certain point, the database cache size may be capped at an artificially low limit. An increase in this limit may increase performance.

Effective database cache size:

Indicates the amount of system memory, in megabytes, that, hypothetically, would be used by this database cache manager if all used dehydrated/compressed database cache buffers were rehydrated/uncompressed.

MB

 

Resident database cache size:

Indicates the amount of system memory (in megabytes) used by this database cache that is currently part of the working set of the process. 

MB

If the value of this measure is ever significantly smaller than that of the Database cache size measure then the operating system has chosen to reclaim that system memory for use in other parts of the system.  The database cache will recover from this event but if this is a common occurrence then it can lead to significant performance problems.

Database cache page faults rate:

Indicates the rate of page faults that cannot be serviced because there are no pages available for allocation from the database cache.

Stalls/Sec

If the value of this measure is nonzero most of the time, the clean threshold may be too low.

 

Average read operation latency per database:

Indicates the average time taken by this database to perform read operations.

Secs

A low value is desired for this measure. A consistent increase in this value is indicative of a bottleneck when reading from the database. Compare the value of this measure across databases to know which database is the slowest when reading and why.

Rate of database read operations completed:

Indicates the rate at which this database completes read operations.

Reads/Sec

A consistent drop in this value can indicate a reduction in the number of read requests. It can also indicate a reading bottleneck. In the case of the latter, compare the value of this measure across databases to know which database is the slowest when processing read requests.

Average write operation latency per database:

Indicates the average time taken by this database to perform write operations.

Secs

A high value is a cause for concern as it impacts how quickly transaction recovery is performed in the event of a transaction failure. If the contents of the transaction log file are not written to the database file quickly, transaction recovery will be delayed.

Rate of database write operations completed:

Indicates the rate at which this database completes write operations.

Writes/Sec

A consistent drop in this value can indicate a reduction in the number of write requests. It can also indicate a writing bottleneck. In the case of the latter, compare the value of this measure across databases to know which database is the slowest when processing write requests.

Average read operation latency per log file:

Indicates the average time taken by this database to read a log file.

Secs

A high value is a cause for concern as it impacts how quickly transaction recovery is performed in the event of a transaction failure. If the transaction log file is read slowly, then it will delay the replay of the unwritten transactions and the writing of the transactions to the database file. This in turn will significantly slow down transaction recovery.

Rate of log file read operations completed:

Indicates the rate at which this database completes log file read operations.

Reads/Sec

A consistent drop in the value of this measure is indicative of a slowdown when reading from a log file.

Average write operation latency per log file:

Indicates the average time taken by this database to write from a log file.

Secs

A high value is a cause for concern as it impacts how quickly transaction recovery is performed in the event of a transaction failure. If the contents of the transaction log file are not written to the database file quickly, transaction recovery will be delayed.

Rate of log file write operations completed:

Indicates the rate at which this database completes log file write operations.

Reads/Sec

A consistent drop in the value of this measure is indicative of a slowdown when writing from a log file.

Database sessions used by client threads:

Indicates the percentage of database sessions currently open for use by client threads.

Percent

A high value is desired for this measure. A very low value could indicate that too many open sessions are idle.

Database sessions in use:

Indicates the number of database sessions currently open for use by client threads.

Number