Docker Containers - Performance Test

This test monitors each container available in Docker and reports its CPU utilization, I/O processing, and memory-related statistics such as memory utilization, page-ins and page-outs, and any errors that were detected. Using this test, administrators can identify processing and memory bottlenecks and rectify them before users complain that the containers are slow to respond.

Note:

  • The Docker server should be v1.5 or above.
  • Remote REST API should be enabled on the Docker host. To know how to enable remote REST API, follow the procedure discussed in How does eG Enterprise Monitor Docker?
  • The eG agent should be 'allowed' to make remote REST API calls to pull metrics. For this purpose, make sure you configure this test with the credentials of a user who has permissions to connect to REST API and invoke its methods.
  • Make sure that the WEBPORT parameter of this test is configured with the port on which remote REST API has been enabled.

Target of the test : A Docker server

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each container available in the Docker server being monitored.

Configurable parameters for the test
Parameter Description

Test Period

How often the test should be executed.

Host

The IP address of the host for which this test is to be configured.

Port

The port number at which the specified HOST listens. The default is 2375.

UseSUDO

This flag is not applicable to this test.

Docker User and Docker Password

Specify the credentials of the user who is authorized to access the remote REST API and invoke its methods for metrics collection.

Confirm Password

Confirm the Password by retyping it here.

Webport

By default, the remote REST API is enabled on port 2375. This implies that by default, this test connects to port 2375 to access the remote REST API and make API calls for metrics collection. In some environments however, the remote REST API can be enabled on a different port. To know how to enable the remote REST API on a different port, follow the procedure discussed in How does eG Enterprise Monitor Docker?

Make sure you configure this parameter with the exact port on which the remote REST API has been enabled. To know which port number that is, do the following:

  • Open the /lib/systemd/system/docker.service file on the Docker host.
  • In the file, find the line which starts with ExecStart. In that line, look for the following entry:

    -H=tcp://0.0.0.0:<Remote_API_Port>

    The number that appears after ':' in the entry above is the remote REST API port.
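The lookup described above can be sketched as a small script. This is a minimal illustration, not part of the eG agent: the function name `find_remote_api_port` and the sample unit file text are made up for the example, and the pattern assumes the option is written as `-H=tcp://host:port` or `-H tcp://host:port`.

```python
import re


def find_remote_api_port(unit_text):
    """Return the port of the first '-H tcp://host:port' option found on an
    ExecStart line, or None if no such option is present."""
    for line in unit_text.splitlines():
        line = line.strip()
        if not line.startswith("ExecStart"):
            continue
        # Accept both '-H=tcp://...' and '-H tcp://...'.
        match = re.search(r"-H[=\s]\s*tcp://[^:\s]+:(\d+)", line)
        if match:
            return int(match.group(1))
    return None


# Hypothetical unit file content, used only for illustration.
sample = """[Service]
ExecStart=/usr/bin/docker daemon -H=tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
"""

print(find_remote_api_port(sample))
```

To check a real host, the content of the docker.service file would be read and passed to the function in place of `sample`.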

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Disk space usage

Indicates the amount of disk space utilized by this container.

MB

 

Data received rate

Indicates the rate at which the data was received by this container.

Mbps

A sudden increase or decrease in the value of this measure could be a cause of concern.
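A rate like this is typically derived from two readings of a cumulative byte counter. The sketch below shows that arithmetic; it assumes that Mbps means megabits per second and that the counter is a running total of received bytes. The function name `rate_mbps` is illustrative and does not describe how the eG agent computes the measure.

```python
def rate_mbps(prev_bytes, curr_bytes, interval_seconds):
    """Convert two cumulative byte-counter readings into megabits per second.

    Returns None if the counter went backwards (for example, after a
    container restart), because no meaningful rate can be derived then.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_bytes - prev_bytes
    if delta < 0:
        return None
    return delta * 8 / 1_000_000 / interval_seconds


# 2,500,000 bytes received over a 10-second measurement period.
print(rate_mbps(1_000_000, 3_500_000, 10))
```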

Incoming traffic

Indicates the rate at which packets were received by this container.

Pkts/sec

A significant increase or decrease in the value of this measure may alter traffic conditions on the Docker server.

Errors received

Indicates the number of errors that occurred while data was being received by this container.

Number

Ideally, the value of this measure should be zero.

Packets dropped during reception

Indicates the number of packets dropped by this container during reception.

Number

Ideally, the value of this measure should be zero.

Data transmit rate

Indicates the rate at which the data was transmitted by this container.

Mbps

 

Outgoing traffic

Indicates the rate at which packets were transmitted by this container.

Pkts/sec

 

Errors transmitted

Indicates the number of errors that occurred while data was being transmitted by this container.

Number

Ideally, the value of this measure should be zero.

Packets dropped during transmission

Indicates the number of packets dropped by this container during transmission.

Number

Ideally, the value of this measure should be zero.

CPU utilization

Indicates the percentage of CPU that is currently utilized by this container.

Percent

The detailed diagnosis of this measure lists the processes that are consuming the CPU utilized by the container.

Comparing the value of this measure across the containers will enable you to accurately identify the container on which CPU-intensive applications are executing.
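For readers who want to cross-check this value by hand, the sketch below derives a CPU percentage from one sample of container statistics. It assumes the sample has the shape of the Docker Engine API stats response in recent Docker versions (`cpu_stats`, `precpu_stats`, `online_cpus`); older Docker versions may not return all of these fields, and whether the eG agent uses this exact calculation is not stated in this document.

```python
def cpu_percent(stats):
    """Estimate container CPU usage (percent) from one stats sample that
    carries both the current and the previous CPU counters."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Fall back to the per-CPU list, then to 1, if online_cpus is absent.
    ncpus = (
        cpu.get("online_cpus")
        or len(cpu["cpu_usage"].get("percpu_usage") or [])
        or 1
    )
    return cpu_delta / system_delta * ncpus * 100.0


# Made-up counter values, for illustration only.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 750},
        "system_cpu_usage": 2000,
        "online_cpus": 2,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 500},
        "system_cpu_usage": 1000,
    },
}

print(cpu_percent(sample))
```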

CPU utilization in kernelmode

Indicates the percentage of CPU utilized by this container in kernel mode.

Percent

A processor in a server has two different modes: user mode and kernel mode. The processor switches between the two modes depending on what type of code is running on the processor. Applications run in user mode, and core operating system components run in kernel mode.

In kernel mode, the executing code has complete and unrestricted access to the underlying hardware. It can execute any CPU instruction and reference any memory address. Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system. Crashes in kernel mode are catastrophic: because containers share the kernel of the Docker host, such a crash halts the host and every container running on it.

A high value for this measure indicates that the container is consuming too much CPU executing code in kernel mode.

CPU utilization in usermode

Indicates the percentage of CPU utilized by this container in user mode.

Percent

In user mode, the executing code has no ability to directly access hardware or reference memory. Code running in user mode must delegate to system APIs to access hardware or memory. Due to the protection afforded by this sort of isolation, crashes in user mode are always recoverable. Most of the code running in the containers will execute in user mode. User mode processes communicate with and use the kernel through the kernel API and system calls.

Memory used

Indicates the amount of memory that is currently utilized by this container.

MB

 

Memory limit

Indicates the maximum amount of memory that is allocated to this container.

MB

 

Memory utilization

Indicates the percentage of memory utilized by this container.

Percent

The detailed diagnosis of this measure lists the processes that are consuming the memory utilized by the container.

A high value for this measure indicates that the memory resources of the container are being depleted rapidly.
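The relationship between the Memory used, Memory limit, and Memory utilization measures can be illustrated with a short calculation. This is a sketch under the assumption that utilization is simply used memory divided by the limit; the function name is illustrative.

```python
def memory_utilization(used_mb, limit_mb):
    """Return used memory as a percentage of the limit, or None when no
    usable limit is available."""
    if limit_mb <= 0:
        return None
    return used_mb / limit_mb * 100.0


# A container using 256 MB out of a 1024 MB limit.
print(memory_utilization(256, 1024))
```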

Memory max usage

Indicates the maximum amount of memory utilized by this container.

MB

 

Active anonymous memory

Indicates the amount of anonymous memory that has been identified as active by the kernel.

MB

Anonymous memory is a large, zero-filled block of memory that the kernel maps directly from the anonymous memory region when a container requires a large allocation, ideally a multiple of the page size. Pages of anonymous memory are not linked to any file on disk; they are part of a program's data area or stack.

All anonymous pages are initially active. As the kernel sweeps over memory at regular intervals, it tags some of these pages as inactive. Whenever an inactive page is accessed, it is immediately retagged as active. When the kernel is almost out of memory and has to swap out to disk, it swaps the inactive pages.

Inactive anonymous memory

Indicates the amount of anonymous memory that has been identified as inactive by the kernel.

MB

Cache memory

Indicates the amount of memory used by processes of the control group that can be associated precisely with a storage block on this container.

MB

When you read from and write to files on the disk, the amount of cache memory will increase. The size of the cache memory depends on the number of read and write operations performed on this container.

Active file

Indicates the amount of cache memory that has been identified as active by the kernel.

MB

Like anonymous memory, pages in the cache memory can move between the active and inactive states, but the exact rules the kernel uses to move pages between the two sets differ from the rules used for anonymous memory. When the kernel needs to reclaim memory, clean cache pages are cheaper to reclaim because they can be released immediately, whereas anonymous pages and dirty (modified) pages have to be written to disk first.

Inactive file

Indicates the amount of cache memory that has been identified as inactive by the kernel.

MB

Memory mapped

Indicates the amount of memory mapped by the processes in the control group.

MB

Docker on Linux also makes use of another technology called cgroups, or control groups. A key to running applications in isolation is to have them use only the resources you want. This ensures containers are good multi-tenant citizens on a host. Control groups allow Docker to share the available hardware resources among containers and, if required, set up limits and constraints, for example by limiting the memory available to a specific container.
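On Linux hosts that use cgroup v1, the memory controller exposes per-control-group counters in a `memory.stat` pseudo-file, with keys such as `cache`, `rss`, `mapped_file`, `pgfault`, and `pgmajfault`. The sketch below parses text in that format and converts byte values to MB. It is only an illustration: the sample values are made up, and this document does not state whether the eG agent reads this file or obtains the figures some other way.

```python
# Keys that are event counts rather than byte sizes.
COUNTER_KEYS = {"pgfault", "pgmajfault", "pgpgin", "pgpgout"}


def parse_memory_stat(text):
    """Parse 'key value' lines in memory.stat format.

    Byte-sized entries are converted to MB; event counters are kept as
    integers. Lines that do not match the expected format are skipped.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        key, value = parts[0], int(parts[1])
        if key in COUNTER_KEYS:
            stats[key] = value
        else:
            stats[key] = value / (1024 * 1024)
    return stats


# Made-up sample content, for illustration only.
sample = """cache 10485760
rss 52428800
mapped_file 2097152
pgfault 1200
pgmajfault 3
"""

for name, value in parse_memory_stat(sample).items():
    print(name, value)
```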

Page faults

Indicates the number of times a process in this container triggered a page fault.

Number

A page fault occurs when a process accesses a part of its virtual memory space which is nonexistent or protected. The former can happen if the process is buggy and tries to access an invalid address (it will then be sent a SIGSEGV signal, typically killing it with the famous Segmentation fault message). The latter can happen when the process reads from a memory zone which has been swapped out, or which corresponds to a mapped file: in that case, the kernel will load the page from disk, and let the CPU complete the memory access. It can also happen when the process writes to a copy-on-write memory zone: likewise, the kernel will preempt the process, duplicate the memory page, and resume the write operation on the process's own copy of the page.

Major page faults occur when the kernel actually has to read the data from disk. When the kernel only has to duplicate an existing page or allocate an empty page, the fault is a regular (minor) fault.

Page major faults

Indicates the number of times a process in this container triggered a major page fault.

Number

Page ins

Indicates the number of pages that were added to the control group of this container.

Number

 

Page outs

Indicates the number of pages that were uncharged from (no longer billed to) the control group of this container.

Number

 

Resident set size

Indicates the amount of memory used by this container that doesn't correspond to anything on disk, such as stacks, heaps, and anonymous memory maps.

MB

 

Huge resident set size

Indicates the amount of memory used by this container that doesn't correspond to anything on disk (such as stacks, heaps, and anonymous memory maps) and is backed by transparent huge pages.

MB

 

Swap memory

Indicates the amount of swap memory that is currently utilized by the processes in the control group of this container.

MB

An unusually high value for the swap memory can indicate a memory bottleneck.

Memory unevictable

Indicates the amount of memory that cannot be reclaimed by this container.

MB

Generally, unevictable memory has been locked with mlock. It is often used by crypto frameworks to make sure that secret keys and other sensitive material never get swapped out to disk.

Writeback

Indicates the amount of memory that was written back in this container.

MB

A high value is preferred for this measure. Normally, data is first written to the cache and only afterwards to the memory or disk behind it. During idle machine cycles, the data is written from the cache to the memory or disk at high speed. In this way, write-back caching improves performance and write speed compared to writing directly to RAM or disk.

Failures

Indicates the number of memory failures that occurred in this container.

Number

Ideally, the value of this measure should be low.

Data reads

Indicates the rate at which the data was read from this container.

Mbps

Compare the values of these measures across the containers to identify the slowest container in terms of processing read and write operations (respectively).

Data writes

Indicates the rate at which the data were written into this container.

Mbps

Data sync

Indicates the rate at which the synchronous I/O operations were performed on this container.

Mbps

In synchronous file I/O, a thread starts an I/O operation and immediately enters a wait state until the I/O request has completed. The I/O operations of this container are therefore performed one at a time, which prevents unwanted data traffic.

Data async

Indicates the rate at which the asynchronous I/O operations were performed on this container.

Mbps

A thread performing asynchronous file I/O sends an I/O request to the kernel by calling an appropriate function. If the request is accepted by the kernel, the calling thread continues processing another job until the kernel signals to the thread that the I/O operation is complete. Because the thread does not sit idle while waiting, the overall speed of I/O operations can increase.

Total data

Indicates the total rate at which I/O operations were performed on this container.

Mbps