Server Load Average Test

In UNIX computing, the system load is a measure of the amount of work that a computer system performs. The load average represents the average system load over a period of time. This test reports the average load of a Unix system as three metrics, which represent the system load over the last one-, five-, and fifteen-minute periods.
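
For illustration only, the three values described above can be read on most Unix systems with Python's standard os.getloadavg() call. This is a minimal sketch and is not a description of how the agent itself collects the measures; the function name read_load_averages and the dictionary keys are assumptions made for this example.

```python
import os


def read_load_averages():
    """Return the 1-, 5-, and 15-minute load averages as a dict."""
    # os.getloadavg() is available on Unix only and raises OSError
    # if the load average cannot be obtained.
    one, five, fifteen = os.getloadavg()
    return {"1min": one, "5min": five, "15min": fifteen}


if __name__ == "__main__":
    for period, value in read_load_averages().items():
        print(f"Average load in the last {period}: {value:.2f}")
```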

Note:

This test executes only on Unix systems.

Target of the test : Any Unix host system

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each host monitored

Configurable parameters for the test
  1. Test period - How often the test should be executed.
  2. Host - The host for which the test is to be configured.
  3. Port - The port used by the specified host. By default, it is NULL.
Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Average load in the last 1 min:

Indicates the average number of processes using or waiting for the CPU (the run queue) over the past 1 minute.

Number

For an idle computer, the value of these measures will be 0. Each process that is using or waiting for the CPU (that is, each process in the ready queue or run queue) increments these values by 1.

Most UNIX systems count only processes in the running (on CPU) or runnable (waiting for CPU) states. However, Linux also includes processes in uninterruptible sleep states (usually waiting for disk activity), which can lead to markedly different results if many processes remain blocked in I/O due to a busy or stalled I/O system. This includes, for example, processes blocked by an NFS server failure or by slow media (e.g., USB 1.x storage devices). Such circumstances can significantly increase the value of this measure. The increase may not reflect an actual rise in CPU use, but it still gives an idea of how long users have to wait.
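
On Linux, one rough way to tell whether a high value comes from CPU demand or from blocked I/O is to count how many processes are in the runnable (R) and uninterruptible sleep (D) states. The sketch below assumes a Linux /proc filesystem and reads the state field of each /proc/<pid>/stat file; the helper names are hypothetical. It is only an approximation, since it takes a single snapshot and looks at one entry per process rather than every thread.

```python
import glob


def process_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before reading the state.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(stat_lines):
    """Count runnable (R) and uninterruptible-sleep (D) entries."""
    counts = {"R": 0, "D": 0}
    for line in stat_lines:
        state = process_state(line)
        if state in counts:
            counts[state] += 1
    return counts


def read_stat_lines():
    """Read the stat file of every process currently listed in /proc."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            continue  # the process exited while we were scanning
    return lines


if __name__ == "__main__":
    print(count_states(read_stat_lines()))
```

A large D count alongside a small R count suggests that the load is driven by I/O waits rather than by CPU contention.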

For single-CPU systems that are CPU-bound, one can think of load average as a percentage of system utilization during the respective time period. For systems with multiple CPUs, one must divide the number by the number of processors in order to get a comparable percentage.
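
The per-processor division described above can be sketched as follows. The function name load_per_cpu is an assumption for this example; os.cpu_count() is a standard Python call that reports the number of logical CPUs, which may differ from the processor count a given system uses for scheduling.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Divide a load average by the number of processors."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


if __name__ == "__main__":
    # os.cpu_count() can return None if the count is undetermined.
    cpus = os.cpu_count() or 1
    one_minute = os.getloadavg()[0]  # Unix only
    print(f"1-minute load per CPU: {load_per_cpu(one_minute, cpus):.2f}")
```

A result near 1.0 means the processors were, on average, fully occupied; a result well below 1.0 means spare capacity.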

For example, if these measures report the values 1.73, 0.50, and 7.98, respectively, on a single-CPU system, these values can be interpreted as follows:

  • during the last minute, the CPU was overloaded by 73% (1 CPU with 1.73 runnable processes, so that 0.73 processes had to wait for a turn)
  • during the last 5 minutes, the CPU was underloaded by 50% (no processes had to wait for a turn)
  • during the last 15 minutes, the CPU was overloaded by 698% (1 CPU with 7.98 runnable processes, so that 6.98 processes had to wait for a turn)

This means that this CPU could have handled all of the work scheduled for the last minute if it were 1.73 times as fast, or if there were two (the ceiling of 1.73) times as many CPUs, but that over the last five minutes it was twice as fast as necessary to prevent runnable processes from waiting their turn. In a system with four CPUs, a load average of 3.73 would indicate that there were, on average, 3.73 processes ready to run, and each one could be scheduled into a CPU.
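
The arithmetic behind this interpretation can be reproduced with a short sketch. The function name interpret_load and its return format are assumptions for illustration; percentages are rounded to whole numbers to avoid floating-point noise.

```python
import math


def interpret_load(load_average, cpu_count=1):
    """Return (status, percent, cpus_needed) for one load average value.

    percent is how far the per-CPU load is above or below 1.0, and
    cpus_needed is the ceiling of the load average, i.e. the number of
    CPUs at which no runnable process would have had to wait.
    """
    per_cpu = load_average / cpu_count
    percent = round(abs(per_cpu - 1.0) * 100)
    if per_cpu > 1.0:
        status = "overloaded"
    elif per_cpu < 1.0:
        status = "underloaded"
    else:
        status = "fully loaded"
    cpus_needed = max(1, math.ceil(load_average))
    return status, percent, cpus_needed


if __name__ == "__main__":
    for label, value in (("1 min", 1.73), ("5 mins", 0.50), ("15 mins", 7.98)):
        status, percent, cpus = interpret_load(value)
        print(f"{label}: {status} by {percent}%, {cpus} CPU(s) needed")
```

For the four-CPU case, interpret_load(3.73, cpu_count=4) reports an underloaded system, which matches the statement that each ready process could be scheduled onto a CPU.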

Average load in the last 5 mins:

Indicates the average number of processes using or waiting for the CPU (the run queue) over the past 5 minutes.

Number

Average load in the last 15 mins:

Indicates the average number of processes using or waiting for the CPU (the run queue) over the past 15 minutes.

Number