K8s Cluster Health Monitor Test
K8s Cluster Health Monitor test helps the administrators to ensure that all cluster components are healthy, workloads are performing well, resources are adequately allocated, and the cluster is secure. Continuous monitoring and timely alerting helps them to detect issues early, optimize resources, prevent downtime, and maintain a reliable, high-performing AKS environment.
Target of the test : Azure Kubernetes Service Cluster
Agent deploying the test : A remote agent
Outputs of the test : One set of results for Azure Kubernetes Service Cluster being monitored.
| Parameter | Description |
|---|---|
|
Test Period |
How often should the test be executed. |
|
Host |
The IP address of the host for which this test is to be configured. |
|
Port |
Specify the port at which the specified Host listens. By default, this is 6443. |
|
Load Balancer / Master Node IP |
To run this test and report metrics, the eG agent needs to connect to the Kubernetes API on the master node and run API commands. To enable this connection, the eG agent has to be configured with either of the following:
By default, this parameter will display the Load Balancer / Master Node IP that you configured when manually adding the Kubernetes/OpenShift cluster for monitoring, using the Kubernetes Cluster Preferences page in the eG admin interface (see Figure 3). The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise? Whenever the eG agent runs this test, it uses the IP address that is displayed (by default) against this parameter to connect to the Kubernetes API. If there is any change in this IP address at a later point in time, then make sure that you update this parameter with it, by overriding its default setting. |
|
SSL |
By default, the Kubernetes/OpenShift cluster is SSL-enabled. This is why, the eG agent, by default, connects to the Kubernetes API via an HTTPS connection. Accordingly, this flag is set to Yes by default. If the cluster is not SSL-enabled in your environment, then set this flag to No. |
|
K8s Cluster API Prefix |
By default, this parameter is set to none. Do not disturb this setting if you are monitoring a Kubernetes/OpenShift Cluster. To run this test and report metrics for Rancher clusters, the eG agent needs to connect to the Kubernetes API on the master node of the Rancher cluster and run API commands. The Kubernetes API of Rancher clusters is of the default format: http(s)://{IP Address of kubernetes}/{api endpoints}. The Server section of the kubeconfig.yaml file downloaded from the Rancher console helps in identifying the Kubernetes API of the cluster. For e.g., https://{IP address of Kubernetes}/k8s/clusters/c-m-bznxvg4w/ is usually the URL of the Kubernetes API of a Rancher cluster. For the eG agent to connect to the master node of a Rancher cluster and pull out metrics, the eG agent should be made aware of the API endpoints in the Kubernetes API of the Rancher cluster. To aid this, you can specify the API endpoints available in the Kubernetes API of the Rancher cluster against this parameter. In our example, this parameter can be specified as: /k8s/clusters/c-m-bznxvg4w/. |
|
Authentication Token |
The eG agent requires an authentication bearer token to access the Kubernetes API, run API commands on the cluster, and pull metrics of interest. The steps for generating this token have been detailed in How Does eG Enterprise Monitor a Kubernetes/OpenShift Cluster? Typically, once you generate the token, you can associate that token with the target Kubernetes/OpenShift cluster, when manually adding that cluster for monitoring using the eG admin interface. The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise? By default, this parameter will display the Authentication Token that you provided in the Kubernetes Cluster Preferences page of the eG admin interface, when manually adding the cluster for monitoring (see Figure 3). Whenever the eG agent runs this test, it uses the token that is displayed (by default) against this parameter for accessing the API and pulling metrics. If for any reason, you generate a new authentication token for the target cluster at a later point in time, then make sure you update this parameter with the change. For that, copy the new token and paste it against this parameter. |
|
Namespace to Monitor |
To enable the eG agent to monitor a specific Namespace on Kubernetes/OpenShift cluster, specify the name of that Namespace against this parameter. For instance, eshop. Doing so will enable the eG agent to monitor and report metrics specific to this Namespace. |
|
Proxy Host |
If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the IP address of the proxy server here. If no proxy is used, then the default setting -none - of this parameter, need not be changed, |
|
Proxy Port |
If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the port number at which that proxy server listens here. If no proxy is used, then the default setting -none - of this parameter, need not be changed, |
|
Proxy Username, Proxy Password, Confirm Password |
These parameters are applicable only if the eG agent uses a proxy server to connect to the Kubernetes/OpenShift cluster, and that proxy server requires authentication. In this case, provide a valid user name and password against the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box. If no proxy server is used, or if the proxy server used does not require authentication, then the default setting - none - of these parameters, need not be changed. |
|
Kubernetes version |
The Version text box indicates the version of the Kubernetes/OpenShift cluster to be managed. The default value is none. If the value of this parameter is not "none", the test uses the value provided (e.g., 28.1) as the Kubernetes version. |
|
Timeout |
Specify the duration (in seconds) for which this test should wait for a response from the Kubernetes/OpenShift cluster. If there is no response from the cluster beyond the configured duration, the test will timeout. By default, this is set to 5 seconds. |
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
|
Cluster health |
Indicates the percentage of health of AKS clusters. |
Percent |
A value close to 100 is required for a healthy AKS cluster environment. |