Java Key Business Transactions Test

For any business-critical application, some transactions will always be considered key from the point of view of user experience and business impact. For instance, in the case of a retail banking web application, fund transfers executed online are critical transactions that have to be tracked closely for delays / errors, as problems in the transaction will cost both consumers and the company dearly. Using the Java Key Business Transactions test, administrators can perform focused monitoring of such critical transactions alone.

For each transaction URL pattern configured for monitoring on a JVM node, this test reports the count of requests for that transaction pattern, and the count and percentage of transactions of that pattern that were slow / stalling / error-prone. Detailed diagnostics provided by the test highlight the slow / stalled / error transactions of a pattern, and pinpoint the precise reason why that key transaction slowed down / stalled / encountered errors: is it because of an inefficient database query, a processing bottleneck on the JVM node, or slow remote service calls? This way, the test enables you to quickly detect inconsistencies in the performance of your critical business transactions and accurately isolate their root cause, so that you can fix the issues well before users notice them.

Target of the Test : A BTM-enabled JVM

Agent deploying the test : An internal/remote agent

Output of the test : One set of results for each URL pattern configured for monitoring.

Test parameters:

Configurable parameters for the test
Parameter Description

Test Period

How often the test should be executed.

Host

The host for which this test is to be configured.

BTM Port

Specify the port number configured as BTM_Port in the btmOther.props file on the JVM node being monitored. If the JVM is being monitored in an agent-based manner, then the btmOther.props file will be in the <EG_AGENT_INSTALL_DIR>\lib\bm directory.

URL Patterns

Provide a comma-separated list of PatternName:URLPattern pairs to be monitored. The PatternName can be any name that uniquely identifies the pattern. These PatternNames will be the descriptors of this test. For the URLPattern, you can either provide the exact URL to be monitored, or provide a pattern. For instance, if you want to monitor requests to distinct and specific web pages (say, login.jsp and payment.jsp of a web application), then you can specify the exact URLs of these web pages as your URL Patterns. In this case, your specification will be: Login:/web/login.jsp,Payment:/web/payment.jsp. On the other hand, if you want to monitor requests to all payment-related web pages in a web application (say, payment.jsp, creditcardpayment.jsp, debitcardpayment.jsp, onlinepayment.jsp, and more), and you want the metrics to be grouped under a single head called Payment, then you can specify a pattern instead of the exact URL. In this case, your URL Patterns specification will be Payment:*payment*. The leading '*' in the specification signifies any number of leading characters, while the trailing '*' signifies any number of trailing characters. This means that the specification in our example will track requests to all pages with names that contain the word payment. Your URLPattern can also be *expr or expr* or *expr1*expr2* or expr1*expr2, etc.
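The following is a minimal sketch of how such a specification could be interpreted. It is not the eG agent's implementation; the function names are hypothetical, and it assumes that '*' stands for any run of characters and that a pattern must match the whole request URL.

```python
import re

def parse_url_patterns(spec):
    """Split 'Name:Pattern,Name:Pattern' into a dict of name -> compiled regex."""
    patterns = {}
    for pair in spec.split(","):
        # Split on the first ':' only, so the URL pattern itself may contain ':'.
        name, _, url_pattern = pair.strip().partition(":")
        # '*' means any run of characters; every other character is literal.
        regex = ".*".join(re.escape(part) for part in url_pattern.split("*"))
        patterns[name] = re.compile(regex)
    return patterns

def matching_descriptors(patterns, url):
    """Return the pattern names (descriptors) whose pattern matches the whole URL."""
    return [name for name, rx in patterns.items() if rx.fullmatch(url)]

patterns = parse_url_patterns("Login:/web/login.jsp,Payment:*payment*")
print(matching_descriptors(patterns, "/web/creditcardpayment.jsp"))  # ['Payment']
print(matching_descriptors(patterns, "/web/login.jsp"))              # ['Login']
```

Under these assumptions, a request to /web/creditcardpayment.jsp is counted under the Payment descriptor, while a URL that matches no configured pattern is not tracked by this test.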

Key Excluded Patterns

By default, this test does not track requests to the following URL patterns:

*.ttf, *.otf, *.woff, *.woff2, *.eot, *.cff, *.afm, *.lwfn, *.ffil, *.fon, *.pfm, *.pfb, *.std, *.pro, *.xsf, *.jpg, *.jpeg, *.jpe, *.jif, *.jfif, *.jfi, *.jp2, *.j2k, *.jpf, *.jpx, *.jpm, *.jxr, *.hdp, *.wdp, *.mj2, *.webp, *.gif, *.png, *.apng, *.mng, *.tiff, *.tif, *.xbm, *.bmp, *.dib, *.svg, *.svgz, *.mpg, *.mpeg, *.mpeg2, *.avi, *.wmv, *.mov, *.rm, *.ram, *.swf, *.flv, *.ogg, *.webm, *.mp4, *.ts, *.mid, *.midi, *.rm, *.ram, *.wma, *.aac, *.wav, *.ogg, *.mp3, *.mp4, *.css, *.js, *.ico, *.cur, /egurkha*

If required, you can remove one or more patterns from this default list, so that such patterns are monitored, or append more patterns to this list in order to exclude them from monitoring.
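As a rough illustration of how an exclusion list of this kind behaves, the sketch below checks a URL against a shortened subset of the default list. The helper names are hypothetical, and whole-URL matching with '*' as a wildcard is an assumption rather than documented agent behavior.

```python
import re

# Illustrative subset of the default exclusion list, not the full list.
DEFAULT_EXCLUDED = ["*.png", "*.css", "*.js", "*.ico", "/egurkha*"]

def to_regex(pattern):
    """Translate a '*' wildcard pattern into a compiled regular expression."""
    return re.compile(".*".join(re.escape(p) for p in pattern.split("*")))

def is_excluded(url, excluded=DEFAULT_EXCLUDED):
    """True if the URL matches any excluded pattern in its entirety."""
    return any(to_regex(p).fullmatch(url) for p in excluded)

print(is_excluded("/web/style.css"))    # True
print(is_excluded("/web/payment.jsp"))  # False: '*.js' does not match '.jsp'

# Removing '*.js' from the list would bring JavaScript requests under monitoring.
without_js = [p for p in DEFAULT_EXCLUDED if p != "*.js"]
print(is_excluded("/web/app.js", without_js))  # False
```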

Method Exec Cutoff (MS)

From the detailed diagnosis of slow/stalled/error transactions, you can drill down and perform deep execution analysis of a particular transaction. In this drill-down, the methods invoked by that slow/stalled/error transaction are listed in the order in which the transaction calls the methods. By configuring a Method Exec Cutoff (MS), you can make sure that only methods that have been executing for a duration greater than the specified cutoff are listed when performing execution analysis. For instance, if you specify 5 here, then the Execution Analysis window for a slow/stalled/error transaction will list only those methods that have been executing for over 5 milliseconds. This way, you get to focus on only those methods that could have caused the slowness, without being distracted by inconsequential methods. By default, the value of this parameter is set to 250 ms.
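The effect of the cutoff can be pictured with a small sketch. The method names and timings below are made up for illustration, and the function is hypothetical.

```python
def methods_above_cutoff(calls, cutoff_ms=250):
    """Keep (method, elapsed_ms) entries slower than the cutoff, in call order."""
    return [(name, ms) for name, ms in calls if ms > cutoff_ms]

# Hypothetical method calls of one slow transaction, in invocation order.
calls = [
    ("OrderService.place", 900),
    ("Cart.total", 3),
    ("Audit.log", 40),
    ("PaymentDao.save", 420),
]

print(methods_above_cutoff(calls))               # default cutoff of 250 ms
print(methods_above_cutoff(calls, cutoff_ms=5))  # a cutoff of 5 ms lists more methods
```

With the default of 250 ms only the two slowest calls are listed; lowering the cutoff to 5 also lists Audit.log, while Cart.total is still filtered out.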

SQL Execution Cutoff (MS)

Typically, from the detailed diagnosis of a slow/stalled/error transaction on a JVM node, you can drill down to view the SQL queries (if any) executed by that transaction from that node and the execution time of each query. By configuring a SQL Execution Cutoff (MS), you can make sure that only queries that have been executing for a duration greater than the specified cutoff are listed when performing query analysis. For instance, if you specify 5 here, then for a slow/stalled/error transaction, the SQL Queries window will display only those queries that have been executing for over 5 milliseconds. This way, you get to focus on only those queries that could have contributed to the slowness. By default, the value of this parameter is set to 10 ms.

Healthy URL Trace

By default, this flag is set to No. This means that eG will not collect detailed diagnostics for those transactions that are healthy. If you want to enable the detailed diagnosis capability for healthy transactions as well, then set this flag to Yes.

Max Healthy URLs per Test Period

This parameter is applicable only if the Healthy URL Trace flag is set to ‘Yes’. Here, specify the number of top-n transactions that should be listed in the detailed diagnosis of the Healthy transactions measure, every time the test runs. By default, this is set to 50, indicating that the detailed diagnosis of the Healthy transactions measure will by default list the top-50 transactions, arranged in the descending order of their response times.

Max Slow URLs per Test Period

Specify the number of top-n transactions that should be listed in the detailed diagnosis of the Slow transactions measure, every time the test runs. By default, this is set to 10, indicating that the detailed diagnosis of the Slow transactions measure will by default list the top-10 transactions, arranged in the descending order of their response times.

Max Stalled URLs per Test Period

Specify the number of top-n transactions that should be listed in the detailed diagnosis of the Stalled transactions measure, every time the test runs. By default, this is set to 10, indicating that the detailed diagnosis of the Stalled transactions measure will by default list the top-10 transactions, arranged in the descending order of their response times.

Max Error URLs per Test Period

Specify the number of top-n transactions that should be listed in the detailed diagnosis of the Error transactions measure, every time the test runs. By default, this is set to 10, indicating that the detailed diagnosis of the Error transactions measure will by default list the top-10 transactions, in terms of the number of errors they encountered.

Show HTTP Status

If you want the detailed diagnosis of this test to report the HTTP response code that was returned when a transaction URL was hit, then set this flag to Yes. This will enable you to instantly identify HTTP errors that may have occurred when accessing a transaction URL. By default, this flag is set to No, indicating that the HTTP status code is not reported by default as part of detailed diagnostics.

Show Cookies

An HTTP cookie is a small piece of data sent from a website and stored on the user's computer by the user's web browser while the user is browsing. Most commonly, cookies are used to provide a way for users to record items they want to purchase as they navigate throughout a website (a virtual "shopping cart" or "shopping basket"). To keep track of which user is assigned to which shopping cart, the server sends a cookie to the client that contains a unique session identifier (typically, a long string of random letters and numbers). Because cookies are sent to the server with every request the client makes, that session identifier will be sent back to the server every time the user visits a new page on the website, which lets the server know which shopping cart to display to the user. Another popular use of cookies is for logging into websites. When the user visits a website's login page, the web server typically sends the client a cookie containing a unique session identifier. When the user successfully logs in, the server remembers that that particular session identifier has been authenticated, and grants the user access to its services. If you want to view and analyze the useful information that is stored in such HTTP response cookies that a web server sends, then set this flag to Yes. By default, this flag is set to No, indicating that cookie information is not reported by default as part of detailed diagnostics.

Show Headers

HTTP headers allow the client and the server to pass additional information with the request or the response. A request header is a header that contains more information about the resource to be fetched or about the client itself. If you want the additional information stored in a request header to be displayed as part of detailed diagnostics, then set this flag to Yes. By default, this flag is set to No, indicating that request headers are not displayed by default in the detailed diagnosis.

Enable Thread CPU Monitoring

If this flag is set to Yes, then this test will additionally report the average time for which the transactions of a pattern were utilizing the CPU resources. This will point you to transaction patterns that are CPU-intensive, and will thus help you right-size your JVMs. By default however, this test will not report the average CPU time of transaction patterns. This is because, by default, the Enable Thread CPU Monitoring flag is set to No for this test.

Enable Thread Contention Monitoring

If this flag is set to Yes, then this test will additionally report the following:

  • The average time for which the transactions of a pattern were waiting, before they resumed execution;
  • The average time for which the transactions of a pattern were blocked from execution by another transaction;

If transactions of a pattern are found to be much slower than the rest or are stalling, then the aforesaid metrics will help administrators determine what could have caused the slowness - is it because the transactions were waiting for too long? or is it because they were being blocked for too long?

By default however, this test will not report the metrics described above, because the Enable Thread Contention Monitoring flag is set to No by default.

Advanced Settings

To optimize transaction performance and conserve space in the eG database, many restraints have been applied by default on the agent’s ability to collect and report detailed diagnostics. Depending upon how well-tuned your eG database is and the level of visibility you require into transaction performance, you may choose to either retain these default settings or override them. If you choose not to disturb the defaults, then set the Advanced Settings flag to No. If you want to modify the defaults, then set this flag to Yes.

POJO Method Tracing Limit and POJO Method Tracing Cutoff Time

These parameters will appear only if the Advanced Settings flag is set to ‘Yes’. Typically, if the monitoring mode of this test is set to Profiler, then, as part of the detailed diagnostics of a transaction, eG reports the execution time of every POJO, non-POJO, and recursive (i.e., a method that calls itself) method call that a JVM node makes when processing that transaction. Of these, POJO method calls are the most expensive, as they are usually large in number. To ensure that attempts made to collect detailed measures related to POJO method calls do not impact the overall responsiveness of the monitored transaction, eG, by default, collects and reports the execution time of only the following POJO method calls:

  • The first 1000 POJO method calls made by the target JVM node for that transaction; (OR)
  • The POJO method calls that were made by the target JVM node within 10 seconds from the start of the monitored transaction on that node;

Accordingly, the POJO Method Tracing Limit is set to 1000 by default, and the POJO Method Tracing Cutoff Time is set to 10 (seconds) by default. Of these two limits, whichever limit is reached first will automatically be applied by eG for determining when to stop POJO tracing. In other words, once a JVM node starts processing a transaction, the agent begins tracking the POJO method calls made by that node for that transaction. In the process, if the agent finds that the configured tracing limit is reached before the tracing cutoff time is reached, then the agent will stop tracking the POJO method calls as soon as the tracing limit is reached. On the other hand, if the tracing limit is not reached, then the agent will continue tracking the POJO method calls until the tracing cutoff time is reached. At the end of the cutoff time, the agent will stop tracking the POJO method calls. For instance, if the JVM node makes 1000 POJO method calls within, say, 6 seconds from when it began processing the transaction, then the eG agent will not wait for the cutoff time of 10 seconds to be reached; instead, it will stop tracing at the end of the thousandth POJO method call, and report the execution time of each of the 1000 calls alone. On the other hand, if the JVM node does not make over 1000 POJO method calls till the 10-second cutoff expires, then the eG agent continues tracking the POJO method calls till the end of 10 seconds, and reports the details of all the calls that were made till the cutoff time.

Depending upon how many POJO calls you want to trace and how much overhead you want to impose on the agent and on the transaction, you can increase / decrease the POJO Method Tracing Limit and POJO Method Tracing Cutoff Time specifications.
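The "whichever limit is reached first" rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the input shape (time offsets in milliseconds from the start of the transaction) are hypothetical, and a call made exactly at the cutoff is assumed to be traced.

```python
def trace_pojo_calls(calls, limit=1000, cutoff_ms=10_000):
    """calls: iterable of (offset_ms_from_transaction_start, method_name), in call order.

    Tracing stops at whichever comes first: `limit` traced calls, or the
    first call made after `cutoff_ms` has elapsed.
    """
    traced = []
    for offset_ms, name in calls:
        if len(traced) >= limit or offset_ms > cutoff_ms:
            break
        traced.append(name)
    return traced

# 1500 calls, 6 ms apart: the 1000-call limit is hit at about 6 seconds.
busy = [(i * 6, f"m{i}") for i in range(1500)]
# 200 calls, 300 ms apart: the 10-second cutoff is hit long before 1000 calls.
sparse = [(i * 300, f"m{i}") for i in range(200)]

print(len(trace_pojo_calls(busy)))    # 1000
print(len(trace_pojo_calls(sparse)))  # 34
```

Raising either parameter widens the trace at the cost of more overhead; lowering it does the opposite.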

Non-POJO Method Tracing Limit

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. By default, when reporting the detailed diagnosis of a transaction on a particular JVM node, this test reports the execution time of only the first 1000 non-POJO method calls (which include JMS, JCO, HTTP, Java, SQL, etc.) that the target JVM node makes for that transaction. This is why the Non-POJO Method Tracing Limit parameter is set to 1000 by default. If you want, you can change the tracing limit to enable the test to report the details of more or fewer non-POJO method calls made by a JVM node. While a high value for this parameter may take you closer to identifying the non-POJO method that could have caused the transaction to slow down on a particular JVM node, it may also marginally increase the overheads of the transaction and the eG agent.

Recursive Method Tracing Limit

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. A recursive method is a method that calls itself. By default, when reporting the detailed diagnosis of a transaction on a particular JVM node, this test reports the execution time of only the first 1000 recursive method calls (which includes JMS, JCO, HTTP, Java, SQL, etc.) that the target JVM node makes for that transaction. This is why the Recursive Method Tracing Limit parameter is set to 1000 by default. If you want, you can change the tracing limit to enable the test to report the details of more or fewer recursive method calls made by a JVM node. While a high value for this parameter may take you closer to identifying the recursive method that could have caused the transaction to slow down on a particular JVM node, it may also marginally increase the overheads of the transaction and the eG agent.

Exception Stacktrace Lines

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. As part of detailed diagnostics, this test, by default, lists the first 10 stacktrace lines of each Java error/exception that it captures on the target JVM node for a specific transaction, so as to enable easy and efficient troubleshooting. This is why the Exception Stacktrace Lines parameter is set to 10 by default. If required, you can have this test display more or fewer stacktrace lines by overriding this default setting.

Included Exceptions

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. By default, this test flags the transactions in which the following errors/exceptions are captured, as Error transactions:

  • All unhandled exceptions;
  • Both handled and unhandled SQL exceptions/errors

This implies that if a programmatically-handled non-SQL exception occurs in a transaction, such a transaction, by default, will not be counted as an Error transaction by this test.

Sometimes however, administrators may want to be alerted even if some non-SQL exceptions that have already been handled programmatically, occur. This can be achieved by configuring a comma-separated list of these exceptions in the Included Exceptions text box. Here, each exception you want to include has to be defined using its fully qualified exception class name. For instance, your Included Exceptions specification can be as follows: java.lang.NullPointerException, java.lang.IndexOutOfBoundsException. Note that wild card characters cannot be used as part of your specification. Once the exceptions to be included are configured, then this test will count all transactions in which such exceptions are captured as Error transactions.

Ignored Exceptions

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. By default, this test flags the transactions in which the following errors/exceptions are captured, as Error transactions:

  • All unhandled exceptions;
  • Both handled and unhandled SQL exceptions/errors

Sometimes, however, administrators may want eG to disregard certain unhandled exceptions (or handled SQL exceptions), as they may not pose any threat to the stability of the transaction or to the web site/web application. To achieve this, administrators can configure a comma-separated list of such inconsequential exceptions in the Ignored Exceptions text box. Here, you need to configure each exception you want to exclude using its fully qualified exception class name. For instance, your Ignored Exceptions specification can be as follows: java.sql.SQLException,java.io.FileNotFoundException. Note that wild card characters cannot be used as part of your specification. Once the exceptions to be excluded are configured, then this test will exclude all transactions in which such exceptions are captured from its count of Error transactions.
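The combined effect of the default rule, the Included Exceptions list, and the Ignored Exceptions list can be sketched as below. The function and the tuple layout are hypothetical. The sketch also makes two assumptions that the text does not settle: an exception named in both lists is treated as ignored, and a transaction is still an Error transaction if it also raised some other qualifying exception.

```python
def is_error_transaction(exceptions, included=(), ignored=()):
    """exceptions: list of (fully_qualified_class_name, handled, is_sql) tuples."""
    for class_name, handled, is_sql in exceptions:
        if class_name in ignored:
            continue  # configured as inconsequential
        # Default rule: any unhandled exception, or any SQL exception,
        # plus handled exceptions that were explicitly included.
        if not handled or is_sql or class_name in included:
            return True
    return False

handled_npe = [("java.lang.NullPointerException", True, False)]
handled_sql = [("java.sql.SQLException", True, True)]

print(is_error_transaction(handled_npe))  # False by default
print(is_error_transaction(handled_npe, included=["java.lang.NullPointerException"]))  # True
print(is_error_transaction(handled_sql))  # True by default
print(is_error_transaction(handled_sql, ignored=["java.sql.SQLException"]))  # False
```

Exact class-name comparison mirrors the note that wild card characters cannot be used in either list.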

Ignored Characters

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. By default, eG excludes all transaction URLs that contain the ‘\’ character from monitoring. If you want eG to ignore transaction URLs with any other special characters, then specify these characters as a comma-separated list in the Ignored Characters text box. For instance, your specification can be: \\,&,~

Max Grouped URLs per Measure Period

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. This test groups URLs according to the Max URL Segments specification. These grouped URLs will be the descriptors of the test. For each grouped URL, response time metrics will be aggregated across all transaction URLs in that group and reported.

When monitoring web sites/web applications to which the transaction volume is normally high, this test may report metrics for hundreds of descriptors. If all these descriptors are listed in the Layers tab page of the eG monitoring console, they will certainly clutter the display. To avoid this, by default, the test displays metrics for a maximum of 50 descriptors (i.e., 50 grouped URLs alone) in the eG monitoring console, during every measure period. This is why the Max Grouped URLs per Measure Period parameter is set to 50 by default.

To determine which 50 grouped URLs should be displayed in the eG monitoring console, the eG BTM follows the below-mentioned logic:

  • Top priority is reserved for URL groups with error transactions. This means that eG BTM first scans URL groups for error transactions. If error transactions are found in 50 URL groups, then eG BTM computes the aggregated response time of each of the 50 groups, sorts the error groups in the descending order of their response time, and displays all these 50 groups alone as the descriptors of this test, in the sorted order.
  • On the other hand, if error transactions are found in only one / a few URL groups – say, only 20 URL groups – then, eG BTM will first arrange these 20 grouped URLs in the descending order of their response time. It will then compute the aggregated response time of the transactions in each of the other groups (i.e., the error-free groups) that were auto-discovered during the same measure period. These other groups are then arranged in the descending order of the aggregated response time of their transactions. Once this is done, eG BTM will then pick the top-30 grouped URLs from this sorted list.

    In this case, when displaying the descriptors of this test in the Layers tab page, the 20 error groups are first displayed (in the descending order of their response time), followed by the 30 ‘error-free’ groups (also in the descending order of their response time).

    At any given point in time, you can increase/decrease the maximum number of descriptors this test should support by modifying the value of the Max Grouped URLs per Measure Period parameter.
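The selection logic described above can be sketched roughly as follows. The function name and the dictionary fields are hypothetical, and the case where more than the maximum number of groups contain errors is not covered by the description; the sketch simply keeps the slowest of them.

```python
def select_descriptors(groups, max_groups=50):
    """groups: list of dicts with 'url', 'avg_response_ms' and 'errors' keys."""
    def response_time(group):
        return group["avg_response_ms"]

    # Groups with error transactions come first, slowest first.
    error_groups = sorted((g for g in groups if g["errors"] > 0),
                          key=response_time, reverse=True)
    # Remaining slots go to error-free groups, also slowest first.
    clean_groups = sorted((g for g in groups if g["errors"] == 0),
                          key=response_time, reverse=True)

    chosen = error_groups[:max_groups]
    chosen += clean_groups[:max_groups - len(chosen)]
    return [g["url"] for g in chosen]

groups = [
    {"url": "/a", "avg_response_ms": 120, "errors": 0},
    {"url": "/b", "avg_response_ms": 300, "errors": 2},
    {"url": "/c", "avg_response_ms": 900, "errors": 0},
    {"url": "/d", "avg_response_ms": 700, "errors": 1},
    {"url": "/e", "avg_response_ms": 450, "errors": 0},
]
print(select_descriptors(groups, max_groups=3))  # ['/d', '/b', '/c']
```

In the example, the two error groups fill the first two slots and the slowest error-free group takes the third, mirroring the 20 + 30 split in the description.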

Max SQL Queries per Transaction

This parameter will appear only if the Advanced Settings flag is set to ‘Yes’. Typically, from the detailed diagnosis of a slow/stalled/error transaction on a JVM node, you can drill down to view the SQL queries (if any) executed by that transaction from that node and the execution time of each query. By default, eG picks the first 500 SQL queries executed by the transaction, compares the execution time of each query with the SQL Execution Cutoff configured for this test, and displays only those queries with an execution time that is higher than the configured cutoff. This is why the Max SQL Queries per Transaction parameter is set to 500 by default.

To improve agent performance, you may want the SQL Execution Cutoff to be compared with the execution time of fewer queries, say 200 queries. Similarly, to increase the probability of capturing more long-running queries, you may want the SQL Execution Cutoff to be compared with the execution time of a larger number of queries, say 1000 queries. For this, you just need to modify the Max SQL Queries per Transaction specification to suit your purpose.
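How the two settings interact can be shown with a short sketch: the query limit is applied first, and the cutoff is then applied to what remains. The function name and sample queries are hypothetical.

```python
def slow_queries(queries, max_queries=500, cutoff_ms=10):
    """queries: list of (sql_text, elapsed_ms) in execution order."""
    considered = queries[:max_queries]  # only the first N queries are examined
    return [(sql, ms) for sql, ms in considered if ms > cutoff_ms]

queries = [
    ("SELECT 1", 2),
    ("SELECT * FROM orders", 85),
    ("UPDATE stock SET qty = qty - 1", 40),
]

print(slow_queries(queries))                 # both queries above 10 ms
print(slow_queries(queries, max_queries=2))  # the third query is never examined
```

A lower limit means less work per transaction, but a slow query that runs late in the transaction may go unreported.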

Timeout

By default, the eG agent will wait for 1000 milliseconds for a response from the eG Application Server agent. If no response is received, then the test will time out. You can change this timeout value, if required.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measures reported by the test:

Measurement Description Measurement Unit Interpretation

All transactions

Indicates the total number of requests received for transactions of this pattern during the last measurement period.

Number

By comparing the value of this measure across transaction patterns, you can identify the most popular transaction patterns. Using the detailed diagnosis of this measure, you can then figure out which specific transactions of that pattern are most requested.

Avg response time

Indicates the average time taken by the transactions of this pattern to complete execution.

Secs

Compare the value of this measure across patterns to isolate the type of transactions that were taking too long to execute. You can then use the detailed diagnosis of the All transactions measure of that pattern to know how much time each transaction of that pattern took to execute. This will lead you to the slowest transaction.

Healthy transactions

Indicates the number of healthy transactions of this pattern.

Number

Healthy transactions percentage

Indicates what percentage of the total number of transactions of this pattern is healthy.

Percent

To know which are the healthy transactions, use the detailed diagnosis of this measure.

Slow transactions

Indicates the number of transactions of this pattern that were slow during the last measurement period.

Number

This measure will report the number of transactions with a response time higher than the configured Slow Transaction Cutoff (MS). A high value is a cause for concern, as too many slow transactions mean that user experience with the web application is poor.

Slow transaction response time

Indicates the average time taken by the slow transactions of this pattern to execute.

Secs

Slow transactions percentage

Indicates what percentage of the total number of transactions of this pattern is currently slow.

Percent

Use the detailed diagnosis of this measure to know which precise transactions of a pattern are slow. You can drill down from a slow transaction to know what is causing the slowness.

Error transactions

Indicates the number of transactions of this pattern that experienced errors during the last measurement period.

Number

A high value is a cause for concern, as too many error transactions to a web application can significantly damage the user experience with that application.

Error transactions response time

Indicates the average duration for which the transactions of this pattern were processed before an error condition was detected.

Secs

The value of this measure will help you discern if error transactions were also slow.

Error transactions percentage

Indicates what percentage of the total number of transactions of this pattern is experiencing errors.

Percent

Use the detailed diagnosis of this measure to isolate the error transactions. You can even drill down from an error transaction in the detailed diagnosis to determine the cause of the error.

Stalled transactions

Indicates the number of transactions of this pattern that were stalled during the last measurement period.

Number

This measure will report the number of transactions with a response time higher than the configured Stalled Transaction Cutoff (MS). A high value is a cause for concern, as too many stalled transactions mean that user experience with the web application is poor.

Stalled transactions response time

Indicates the average time taken by the stalled transactions of this pattern to execute.

Secs

Stalled transactions percentage

Indicates what percentage of the total number of transactions of this pattern is stalling.

Percent

Use the detailed diagnosis of this measure to know which precise transactions of a pattern are stalled. You can drill down from a stalled transaction to know what is causing that transaction to stall.

Slow SQL statements executed

Indicates the number of slow SQL queries that were executed by the transactions of this pattern during the last measurement period.

Number

Slow SQL statement time

Indicates the average execution time of the slow SQL queries that were run by the transactions of this pattern.

Secs

If there are too many slow transactions of a pattern, you may want to check the value of this measure for that pattern to figure out if query execution is slowing down the transactions. Use the detailed diagnosis of the Slow transactions measure to identify the precise slow transaction. Then, drill down from that slow transaction to confirm whether/not database queries have contributed to the slowness. Deep-diving into the queries will reveal the slowest queries and their impact on the execution time of the transaction.

Avg CPU time

Indicates the average time for which transactions of this pattern were utilizing the CPU.

Msecs

Compare the value of this measure across transaction patterns to accurately identify the CPU-intensive transaction patterns.

Note:

This measure is reported only under certain circumstances; in particular, the Enable Thread CPU Monitoring flag of this test must be set to Yes.

Avg block time

Indicates the average duration for which transactions of this pattern were blocked and could not execute.

Msecs

If the Avg response time for any transaction pattern is very high, you may want to check the value of this measure for that pattern. This will help you figure out whether/not prolonged blocking is causing transactions of that pattern to slow down or stall.

Note:

This measure is reported only under certain circumstances; in particular, the Enable Thread Contention Monitoring flag of this test must be set to Yes.

Avg wait time

Indicates the average duration for which transactions of this pattern were waiting before they resumed execution.

Msecs

If the Avg response time for any transaction pattern is very high, you may want to check the value of this measure for that pattern. This will help you figure out whether/not a very high waiting time is what is causing the transactions to slow down/stall.

Note:

This measure is reported only under certain circumstances; in particular, the Enable Thread Contention Monitoring flag of this test must be set to Yes.

Total transactions per minute

Indicates the number of transactions of this pattern that are executed per minute.

Number

This is a good indicator of the transaction processing ability of the target application server.

Error transactions per minute

Indicates the number of error transactions of this pattern that are executed per minute.

Number

A very low value is desired for this measure.

Compare the value of this measure across transaction patterns to find that pattern of transactions that is experiencing errors frequently.
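To see how the count and percentage measures above relate to one another, consider the following sketch. It is illustrative only: the function is hypothetical, the cutoff values are placeholders rather than product defaults, and it assumes that each transaction falls into exactly one category, with errors taking precedence over stalled and stalled over slow.

```python
def summarize(transactions, slow_cutoff_ms, stalled_cutoff_ms):
    """transactions: list of (response_ms, had_error) for one URL pattern."""
    counts = {"healthy": 0, "slow": 0, "stalled": 0, "error": 0}
    for response_ms, had_error in transactions:
        if had_error:
            counts["error"] += 1
        elif response_ms > stalled_cutoff_ms:
            counts["stalled"] += 1
        elif response_ms > slow_cutoff_ms:
            counts["slow"] += 1
        else:
            counts["healthy"] += 1
    total = len(transactions)
    # Each percentage is the category count over all transactions of the pattern.
    percentages = {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}
    return counts, percentages

# Placeholder cutoffs: 4000 ms for slow, 60000 ms for stalled.
sample = [(200, False), (5000, False), (70000, False), (300, True)]
counts, percentages = summarize(sample, slow_cutoff_ms=4000, stalled_cutoff_ms=60000)
print(counts)
print(percentages)
```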