System Details Test

This operating system-specific test relies on native measurement capabilities of the operating system to collect various metrics pertaining to the CPU and memory usage of a host system. The details of this test are as follows:

Target of the test : Any host system

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each host monitored

Configurable parameters for the test
  1. Test period - How often the test should be executed.
  2. Host - The host for which the test is to be configured.
  3. Duration - This parameter is significant only when monitoring Unix hosts. It indicates how frequently, within the specified test period, the agent should poll the host for CPU usage statistics.
  4. summary - This attribute is applicable to multi-processor systems only. If the Yes option is selected, the eG agent reports not only the CPU and memory utilization of each processor, but also the summary (i.e., average) of the CPU and memory utilization across the processors. If the No option is selected, the eG agent reports only the CPU usage of the individual processors.
  5. useiostat - This parameter is of significance to Solaris platforms only. By default, the useiostat flag is set to No, which means that SystemTest reports the CPU utilization of every processor on the monitored system and also provides the average CPU utilization across the processors. If you want SystemTest to report only the average CPU utilization across processors and across user sessions, set the useiostat flag to Yes. In that case, the processor-wise breakup of CPU utilization will not be available.
  6. useps - This flag is applicable only to AIX LPARs. By default, this test uses the tprof command to compute CPU usage on AIX LPARs; accordingly, the useps flag is set to No by default. On some AIX LPARs, however, the tprof command may not function properly (this is an AIX issue). When monitoring such AIX LPARs, you can configure the test to use the ps command for metrics collection instead by setting the useps flag to Yes.

    Note:

    Alternatively, you can set the AIXusePS flag in the [agent_settings] section of the eg_tests.ini file (in the <eg_install_dir>\manager\config directory) to yes (default: no) to enable the eG agent to use the ps command for CPU usage computations on AIX LPARs. If both this global flag and the useps flag for a specific component are set to no, the test uses the default tprof command to compute CPU usage for AIX LPARs. If either flag is set to yes, the ps command is used to compute CPU usage for the monitored AIX LPARs.

    In some high-security environments, the tprof command may require special privileges to execute on an AIX LPAR (e.g., sudo may need to be used to run tprof). In such cases, you can prefix the tprof command with another command (like sudo) or with the full path to a script that grants the required privileges to tprof. To achieve this, edit the eg_tests.ini file (in the <eg_install_dir>\manager\config directory), provide the prefix of your choice against the AixTprofPrefix parameter in the [agent_settings] section, and save the file. For instance, if you set the AixTprofPrefix parameter to sudo, the eG agent will call the tprof command as sudo tprof.

  7. include wait - This flag is applicable to Unix hosts alone. On Unix hosts, CPU time is also consumed when I/O waits occur on the host. By default, this test does not consider the CPU utilized by I/O waits when calculating the value of the CPU utilization measure on Unix hosts. Accordingly, the include wait flag is set to No by default. To include the CPU utilized by I/O waits in CPU usage computations on Unix hosts, set this flag to Yes.
  8. Detailed Diagnosis - To make diagnosis more efficient and accurate, the eG Enterprise suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

    The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

    • The eG manager license should allow the detailed diagnosis capability.
    • Neither the normal nor the abnormal frequency configured for the detailed diagnosis measures should be 0.
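
The way the useps flag, the global AIXusePS flag, and the AixTprofPrefix setting combine on AIX LPARs can be summarized with a minimal Python sketch. The function name and its arguments are hypothetical and chosen for illustration only; the sketch mirrors the rules described for parameter 6 and is not the eG agent's actual code.

```python
def aix_cpu_command(useps="No", aix_use_ps="no", tprof_prefix=""):
    """Return the command that would be used to compute CPU usage on an AIX LPAR.

    Hypothetical helper: it only mirrors the flag rules described above.
    """
    def is_yes(value):
        return value.strip().lower() == "yes"

    # If either the test-level flag or the global flag is yes, ps is used.
    if is_yes(useps) or is_yes(aix_use_ps):
        return "ps"

    # Otherwise tprof is used, optionally behind a prefix such as sudo.
    return f"{tprof_prefix} tprof".strip()


print(aix_cpu_command("No", "no", "sudo"))
print(aix_cpu_command("Yes"))
```

With both flags set to no and a prefix of sudo, the sketch returns sudo tprof; setting either flag to yes returns ps regardless of the prefix.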

Measurements made by the test

Each measurement below is listed with its description, measurement unit, and interpretation, in that order.

CPU utilization:

Indicates the percentage of CPU time utilized on the host system.

Percent

A high value could signify a CPU bottleneck. The CPU utilization may be high because a few processes are consuming a lot of CPU, or because there are too many processes contending for a limited resource. Check the currently running processes to see the exact cause of the problem.

System CPU utilization:

Indicates the percentage of CPU time spent on system-level processing.

Percent

An unusually high value indicates a problem and may be due to too many system-level tasks executing simultaneously.

Run queue length:

Indicates the instantaneous length of the queue in which threads are waiting for processor cycles. This length does not include the threads that are currently being executed.

Number

A value consistently greater than 2 indicates that many processes could be simultaneously contending for the processor.
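
As an illustration of the rule of thumb above, the following Python sketch checks whether a series of run queue length samples stays above 2. The function name is hypothetical, and reading "consistently" as "at least 80% of the samples" is an assumption made for this example, not a definition taken from the product.

```python
def run_queue_is_congested(samples, threshold=2, min_fraction=0.8):
    """Return True if most run queue samples exceed the threshold.

    Assumption: "consistently" means at least min_fraction of the samples.
    """
    if not samples:
        return False
    above = sum(1 for value in samples if value > threshold)
    return above / len(samples) >= min_fraction


print(run_queue_is_congested([3, 4, 5, 3, 6]))
print(run_queue_is_congested([1, 0, 5, 1, 2]))
```

A single spike, as in the second call, does not count as contention under this reading; only a sustained run of values above the threshold does.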

Blocked processes:

Indicates the number of processes blocked for I/O, paging, etc.

Number

A high value could indicate an I/O problem on the host (e.g., a slow disk).

Swap memory:

On Windows systems, this measurement denotes the committed amount of virtual memory, which corresponds to the space reserved for virtual memory in the on-disk paging file(s). On Solaris systems, this metric corresponds to the swap space currently available. On HP-UX and AIX systems, this metric corresponds to the amount of active virtual memory (this computation assumes that one virtual page corresponds to 4 KB of memory).

MB

An unusually high value for swap usage can indicate a memory bottleneck. Check the memory utilization of individual processes to identify those that consume the most memory, and tune their memory usage and allocations accordingly.
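
The HP-UX and AIX computation for Swap memory is a simple unit conversion from virtual pages to MB. A minimal Python sketch follows, using the 4 KB page size stated above; the function name is hypothetical.

```python
PAGE_SIZE_KB = 4  # one virtual page is assumed to be 4 KB, as stated above


def active_virtual_memory_mb(active_pages):
    """Convert a count of active virtual pages to megabytes."""
    return active_pages * PAGE_SIZE_KB / 1024


# 262144 pages * 4 KB = 1048576 KB = 1024 MB
print(active_virtual_memory_mb(262144))
```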

Free memory:

Indicates the amount of memory (including standby and free memory) that is immediately available for use by processes, drivers, or the operating system.

MB

This measure typically indicates the amount of memory available for use by applications running on the target host.

On Unix operating systems (AIX and Linux), the operating system tends to use part of the available memory for caching files, objects, etc. When applications require additional memory, it is released from the operating system cache. Hence, to reflect the true free memory available to applications, the eG agent reports the sum of the free physical memory and the operating system cache memory size as the value of the Free memory measure when monitoring AIX and Linux operating systems.
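
To make the "free plus cache" computation concrete, here is a minimal Python sketch for Linux. It assumes the figures come from text in the /proc/meminfo format and that the MemFree and Cached fields stand for free physical memory and operating system cache, respectively. Which fields the eG agent actually reads is not stated here, so treat both the field choice and the function name as assumptions.

```python
def free_memory_mb(meminfo_text):
    """Sum MemFree and Cached (reported in kB) and return the total in MB."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        # Keep only lines of the form "Name:   12345 kB".
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return (values.get("MemFree", 0) + values.get("Cached", 0)) / 1024


sample = """MemTotal:        8000000 kB
MemFree:         1048576 kB
Cached:          2097152 kB
"""
# (1048576 kB + 2097152 kB) / 1024 = 3072 MB
print(free_memory_mb(sample))
```

On a Linux host, the contents of /proc/meminfo could be passed to the function in place of the sample string.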

Note:

For multi-processor systems, where the CPU statistics are reported for each processor on the system, the statistics that are system-specific (e.g., run queue length, free memory, etc.) are only reported for the "Summary" descriptor of this test.
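
The note above can be pictured with a small Python sketch that assembles one result set per processor plus a "Summary" descriptor. The descriptor names of the form Processor_N, the dictionary keys, and the function name are hypothetical; only the "Summary" descriptor and the use of an average come from this document.

```python
def build_results(per_cpu_util, run_queue, free_mb):
    """Map each descriptor name to its measures for a multi-processor host."""
    if not per_cpu_util:
        raise ValueError("at least one processor reading is required")

    # Per-processor descriptors carry only CPU statistics.
    results = {
        f"Processor_{index}": {"cpu_util": util}
        for index, util in enumerate(per_cpu_util)
    }

    # System-wide statistics appear only under the Summary descriptor,
    # together with the average CPU utilization across processors.
    results["Summary"] = {
        "cpu_util": sum(per_cpu_util) / len(per_cpu_util),
        "run_queue": run_queue,
        "free_mb": free_mb,
    }
    return results


for name, measures in build_results([20.0, 40.0], 1, 512.0).items():
    print(name, measures)
```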