K8s API Server Connectivity Test

The API server is the component on the master that exposes the Kubernetes API. It is the front-end for the Kubernetes control plane. All communication paths from the cluster to the master terminate at the API server. The cluster receives Object specs from users via the API server on the master node. While the master uses the API server to forward the Object specs to the scheduler and to the kubelets, the kubelets also update the master with the Object status via the API server. Instructions for creating, starting, destroying objects on a worker node are also sent to worker nodes via the API server only.

This implies that the non-availability of the API server can bring the entire cluster to a standstill! Administrators may no longer be able to stop, update, or start new pods, services, or the replication controller. Moreover, users will be denied access to the cluster, and consequently, to the business-critical applications/services running within! To avoid this, administrators must periodically check if the API server is available and promptly detect its unavailability. This is exactly what the API Server Connectivity test does!

At configured intervals, this test checks whether/not the API server is available, and instantly alerts administrators if it is not. In addition, this test also reports the responsiveness of the API server. This way, the test urges administrators to investigate the reason for the non-availability and fix it, so that cluster operations resume quickly.

Target of the test : A Kubernetes/OpenShift Cluster

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the Kubernetes/OpenShift cluster being monitored

Configurable parameters for the test
Parameter Description

Test Period

How often should the test be executed.

Host

The IP address of the host for which this test is to be configured.

Port

Specify the port at which the specified Host listens. By default, this is 6443.

Load Balancer / Master Node IP

To run this test and report metrics, the eG agent needs to connect to the Kubernetes API on the master node and run API commands. To enable this connection, the eG agent has to be configured with either of the following:

  • If only a single master node exists in the cluster, then configure the eG agent with the IP address of the master node.
  • If the target cluster consists of more than one master node, then you need to configure the eG agent with the IP address of the load balancer that is managing the cluster. In this case, the load balancer will route the eG agent's connection request to any available master node in the cluster, thus enabling the agent to connect with the API server on that node, run API commands on it, and pull metrics.

By default, this parameter will display theLoad Balancer / Master Node IP that you configured when manually adding the Kubernetes cluster for monitoring, using the Kubernetes Cluster Preferences page in the eG admin interface (see Figure 3). The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise?

Whenever the eG agent runs this test, it uses the IP address that is displayed (by default) against this parameter to connect to the Kubernetes API. If there is any change in this IP address at a later point in time, then make sure that you update this parameter with it, by overriding its default setting.

SSL

By default, the Kubernetes cluster is SSL-enabled. This is why, the eG agent, by default, connects to the Kubernetes API via an HTTPS connection. Accordingly, this flag is set to Yes by default.

If the cluster is not SSL-enabled in your environment, then set this flag to No.

Authentication Token

The eG agent requires an authentication bearer token to access the Kubernetes API, run API commands on the cluster, and pull metrics of interest. The steps for generating this token have been detailed in How Does eG Enterprise Monitor a Kubernetes/OpenShift Cluster?

Typically, once you generate the token, you can associate that token with the target Kubernetes cluster, when manually adding that cluster for monitoring using the eG admin interface. The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise?

By default, this parameter will display the Authentication Token that you provided in the Kubernetes Cluster Preferences page of the eG admin interface, when manually adding the cluster for monitoring (see Figure 3).

Whenever the eG agent runs this test, it uses the token that is displayed (by default) against this parameter for accessing the API and pulling metrics. If for any reason, you generate a new authentication token for the target cluster at a later point in time, then make sure you update this parameter with the change. For that, copy the new token and paste it against this parameter.

Proxy Host

If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the IP address of the proxy server here. If no proxy is used, then the default setting -none - of this parameter, need not be changed,

Proxy Port

If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the port number at which that proxy server listens here. If no proxy is used, then the default setting -none - of this parameter, need not be changed,

Proxy Username, Proxy Password, Confirm Password

These parameters are applicable only if the eG agent uses a proxy server to connect to the Kubernetes/OpenShift cluster, and that proxy server requires authentication. In this case, provide a valid user name and password against the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box.

If no proxy server is used, or if the proxy server used does not require authentication, then the default setting - none - of these parameters, need not be changed.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Availability

Indicates whether/not the API server is available.

Percent

If the value of this measure is 0, it indicates that the API server is unavailable. The value 100 on the other hand indicates that the API server is available.

In the event of the non-availability of the API server, you can use the detailed diagnosis of this measure to figure out the reason for the non-available. You can also use the /var/log/kube-apiserver.log file on the master node to figure out what could have caused the failure of the server.

The common causes for the unavailability of the API server are as follows:

  • API server VM shutting down or crashing;
  • API server losing access to its backing storage

To ensure the high availability of the API server, you may want to consider the following courses of action:

  • Use IaaS provider’s automatic VM restarting feature for IaaS VMs;
  • Use IaaS providers reliable storage (e.g. GCE PD or AWS EBS volume) for VMs with apiserver+etcd

Response time

Indicates the time taken by API server to respond to the requests it receives.

Seconds

Response time being high denotes a problem. Poor response times may be due to the API server being overloaded or misconfigured.