Kube Node Details Test

A node is a worker machine in Kubernetes. A node may be a VM or physical machine, depending on the cluster. Each node contains the services necessary to run pods and is managed by the master components. The services on a node include the container runtime, kubelet and kube-proxy.

The Kube Node Details test continuously monitors the target worker node, reports the node's type and the availability of its network, and proactively alerts administrators when the node is not running. The test also reveals the number of pods that are currently running on the target node. Additionally, administrators can determine whether the node has adequate free disk space for adding new pods and sufficient memory for pod operations, so that they can free up or allocate more disk space and memory where these are found to be insufficient. The test also helps administrators instantly find out if the node is unhealthy and not ready to accept more pods.

Target of the test : A Kubernetes Worker Node

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the target Kubernetes Worker node being monitored.

Configurable parameters for the test

Parameter

Description

Test Period

How often the test should be executed.

Host

The IP address of the host for which this test is to be configured.

Port

Specify the port at which the specified Host listens. By default, this is 6443.

Timeout

In the Timeout text box, specify the duration (in seconds) beyond which the test will time out. The default value is 10 seconds.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Node type

Indicates the type of the target node.

 

A node in a cluster can be a Master node or a Worker node. The cluster contains at least one worker node and at least one master node. The worker node(s) host the pods that are the components of the application. The master node(s) manage the worker nodes and the pods in the cluster. Multiple master nodes are used to provide a cluster with failover and high availability.

If a node is the master node in a cluster, then this measure will report the value Master. For a worker node, this measure will report the value Worker.

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
Master           1
Worker           2

Note:

By default, this test reports the Measure Values listed in the table above to indicate the type of the target node. In the graph of this measure, however, the node type is indicated using the numeric equivalents only.
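
As a rough illustration of how a node's type could be derived and mapped to the numeric equivalents in the table above, the following Python sketch inspects a node's labels. It assumes that the conventional node-role.kubernetes.io/control-plane label (or the older node-role.kubernetes.io/master label) marks a master node. The function and constant names are hypothetical, and this is not necessarily how the eG agent determines the node type.

```python
# Numeric equivalents taken from the table above.
NODE_TYPE_VALUES = {"Master": 1, "Worker": 2}

# Assumption: these well-known role labels identify a master node.
MASTER_LABELS = (
    "node-role.kubernetes.io/control-plane",
    "node-role.kubernetes.io/master",
)


def node_type(labels):
    """Return (measure value, numeric value) for a node's label dictionary."""
    is_master = any(label in labels for label in MASTER_LABELS)
    name = "Master" if is_master else "Worker"
    return name, NODE_TYPE_VALUES[name]


if __name__ == "__main__":
    # Made-up label sets, for illustration only.
    print(node_type({"node-role.kubernetes.io/control-plane": ""}))
    print(node_type({"kubernetes.io/hostname": "worker-1"}))
```

Any node that carries neither label is treated as a worker in this sketch, which is a simplification.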

Status

Indicates whether or not the node is running.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
Not running      0
Running          1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate the current state of the node. In the graph of this measure, however, the state is indicated using the numeric equivalents only.

The detailed diagnosis of this measure reveals the IP address and name of the node, the name of the cluster to which the node belongs, and the versions of the kubelet and container runtime available on the node.
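
To give a feel for where such details live, the sketch below pulls comparable fields out of a Kubernetes Node object that has already been fetched and parsed into a Python dictionary. The field paths (metadata.name, status.addresses, status.nodeInfo.kubeletVersion and status.nodeInfo.containerRuntimeVersion) belong to the Kubernetes Node resource. The function name and the sample values are hypothetical, and the cluster name is left out because this sketch makes no assumption about where it comes from.

```python
def node_details(node):
    """Extract detailed-diagnosis style fields from a Node object (as a dict)."""
    status = node.get("status", {})
    info = status.get("nodeInfo", {})
    # status.addresses is a list of {"type": ..., "address": ...} entries.
    addresses = {a["type"]: a["address"] for a in status.get("addresses", [])}
    return {
        "name": node.get("metadata", {}).get("name"),
        "ip_address": addresses.get("InternalIP"),
        "kubelet_version": info.get("kubeletVersion"),
        "container_runtime": info.get("containerRuntimeVersion"),
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {
        "metadata": {"name": "worker-1"},
        "status": {
            "addresses": [
                {"type": "InternalIP", "address": "10.0.0.12"},
                {"type": "Hostname", "address": "worker-1"},
            ],
            "nodeInfo": {
                "kubeletVersion": "v1.28.2",
                "containerRuntimeVersion": "containerd://1.7.2",
            },
        },
    }
    print(node_details(sample))
```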

Time since node has been created

Indicates how old the target node is.

 

The value of this measure is expressed in number of days, hours, and minutes.
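
For illustration, the following sketch shows one way an age in days, hours, and minutes could be computed from a node's metadata.creationTimestamp, which Kubernetes records as a UTC timestamp such as 2024-01-01T00:00:00Z. The function name and the optional now parameter are hypothetical and exist only to make the example reproducible.

```python
from datetime import datetime, timezone


def node_age(creation_timestamp, now=None):
    """Return the node's age as a (days, hours, minutes) tuple."""
    created = datetime.strptime(
        creation_timestamp, "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Work in whole minutes, then split into days, hours, and minutes.
    total_minutes = int((now - created).total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return days, hours, minutes


if __name__ == "__main__":
    reference = datetime(2024, 1, 3, 5, 30, tzinfo=timezone.utc)
    days, hours, minutes = node_age("2024-01-01T00:00:00Z", now=reference)
    print(f"{days} days, {hours} hours, {minutes} minutes")
```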

Pods capacity

Indicates the maximum number of pods that can be scheduled on the target node.

Number

 

Pods running

Indicates the number of pods that are currently running on the target node.

Number

If the value of this measure for a node equals or is approaching the value of the Pods capacity measure, it indicates that the node is about to exhaust its pod capacity.
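
The comparison described above can be sketched as a simple percentage calculation. The function names and the 90 percent threshold below are hypothetical choices made for this example, not values defined by the test.

```python
def pod_utilization(pods_running, pods_capacity):
    """Return the share of pod capacity in use, as a percentage."""
    if pods_capacity <= 0:
        raise ValueError("pods_capacity must be positive")
    return 100.0 * pods_running / pods_capacity


def nearing_capacity(pods_running, pods_capacity, threshold_pct=90.0):
    """Return True when pod usage reaches the (assumed) warning threshold."""
    return pod_utilization(pods_running, pods_capacity) >= threshold_pct


if __name__ == "__main__":
    # Made-up readings of the Pods running and Pods capacity measures.
    print(pod_utilization(55, 110))
    print(nearing_capacity(99, 110))
```

A threshold on the percentage rather than on the raw pod count keeps the check meaningful across nodes with different pod capacities.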

Is the network of the node unavailable?

Indicates whether or not the network of the node is unavailable, that is, not correctly configured.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
No               0
Yes              1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a node's network is unavailable. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Is the node out of disk?

Indicates whether or not there is insufficient free disk space on the node for adding new pods.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
No               0
Yes              1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a node has run out of disk space. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Is the node under memory pressure?

Indicates whether or not the node is running low on memory.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
No               0
Yes              1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a node is under memory pressure. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Does the node have disk pressure?

Indicates whether or not the node's disk capacity is low.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
No               0
Yes              1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a node is low on disk capacity. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Is the node under PID pressure?

Indicates whether or not too many processes are running on the node.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
No               0
Yes              1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a node is under PID pressure. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

If this measure reports the value Yes for a node - i.e., if too many processes are running on a node - then you can use the detailed diagnosis of this measure to figure out the reason for the anomaly.

Is the node ready?

Indicates whether or not a node is healthy and ready to accept pods.

 

This measure reports the value Yes if a node is healthy and is ready to accept pods. The value No is reported if a node is not healthy and is not accepting pods. The value Unknown is reported if the node controller has not heard from the node in the last node-monitor-grace-period (default is 40 seconds).

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value    Numeric Value
No               0
Yes              1
Unknown          2

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a node is ready. In the graph of this measure, however, the same is indicated using the numeric equivalents only.
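
The Yes/No/Unknown measures above line up with the condition entries that Kubernetes records under a node's status.conditions, where each entry has a type (for example Ready, MemoryPressure, DiskPressure, PIDPressure or NetworkUnavailable) and a status of "True", "False" or "Unknown". The sketch below shows one possible mapping from such a condition to the measure value and its numeric equivalent. The function name is hypothetical, and treating a missing condition as Unknown is an assumption made for this example rather than documented behavior of the test.

```python
# Numeric equivalents taken from the tables above.
CONDITION_VALUES = {"No": 0, "Yes": 1, "Unknown": 2}

# Kubernetes reports condition status as the strings "True", "False", "Unknown".
STATUS_TO_MEASURE = {"True": "Yes", "False": "No"}


def condition_measure(conditions, condition_type):
    """Return (measure value, numeric value) for one node condition type."""
    for condition in conditions:
        if condition.get("type") == condition_type:
            name = STATUS_TO_MEASURE.get(condition.get("status"), "Unknown")
            return name, CONDITION_VALUES[name]
    # Assumption: a condition that is absent is reported as Unknown.
    return "Unknown", CONDITION_VALUES["Unknown"]


if __name__ == "__main__":
    # Made-up conditions list, for illustration only.
    conditions = [
        {"type": "Ready", "status": "True"},
        {"type": "MemoryPressure", "status": "False"},
        {"type": "PIDPressure", "status": "Unknown"},
    ]
    for kind in ("Ready", "MemoryPressure", "PIDPressure", "DiskPressure"):
        print(kind, condition_measure(conditions, kind))
```

Note that Yes is the healthy value only for the readiness measure; for the pressure and network measures, Yes signals a problem.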

The detailed diagnosis of the Node type measure reveals the IP address and name of the node, the name of the cluster to which the node belongs, and the versions of the kubelet and container runtime available on the node.

Figure 1 : The detailed diagnosis of the Node type measure