Monitoring Kubernetes Master node
The eG agent periodically executes tests on the Kubernetes master node, collects the necessary statistics, and reports them to the eG manager. These tests are mapped to specific layers of the Kubernetes master node layer model (see Figure 1).
Figure 1: The layer model of the Kubernetes master node
Using the metrics reported, administrators can quickly find accurate answers to the following performance questions:
-
Is the current utilization of CPU, memory, and disk optimal on each node?
-
Is the memory utilization on the node optimal?
-
Are there any performance bottlenecks in CPU and memory resources?
-
Is the disk usage on the master node nearing capacity?
-
Are there any disk I/O errors or performance issues on the master node?
-
Is the Kubernetes API Server working optimally on the master node?
-
Is the Kubernetes API Server responding with low latency, or is it encountering timeouts?
-
Is the etcd server on the master node healthy? Is there any reason for concern?
-
Are there any issues with etcd leader election or etcd unavailability?
-
Are the kube-scheduler and kube-controller-manager running without issues?
-
Is the number of active pods running on the container manager too high?
-
Are there any failed or stuck pods in the system, including critical control plane components?
-
Are Kubernetes audit logs showing relevant information for system events or errors?
-
Are there any certificate expiry warnings for Kubernetes components?
-
Are the network connections between the master node and the worker nodes stable?
-
Are network policies correctly applied, and are there any connectivity issues between nodes?
-
Is there any abnormal network latency between the master node and other components like etcd or the API server?
-
Are the Kubernetes control plane components auto-scaling based on resource demand?
-
Are there any frequent pod crashes or restarts related to Kubernetes components on the master node?
-
Are there any known security vulnerabilities or misconfigurations in the master node setup?
These questions focus on the most critical areas of monitoring a Kubernetes master node, from resource usage and component health to security and network connectivity. By effectively monitoring Kubernetes Master nodes, organizations can enhance performance, security, and overall reliability of their applications.
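As a rough illustration of the kind of check behind questions such as "Is the Kubernetes API Server working optimally?" and "Is the etcd server healthy?", the sketch below parses the text returned by the API server's `/readyz?verbose` endpoint and separates the passing checks from the failing ones. It does not show how the eG agent collects its metrics. The function name `parse_readyz` and the sample text are assumptions made for this example only.

```python
def parse_readyz(text):
    """Split verbose /readyz output into (passing, failing) check names."""
    passed, failed = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Check lines look like "[+]etcd ok" or "[-]etcd failed: reason withheld".
        if not (line.startswith("[+]") or line.startswith("[-]")):
            continue  # skip the summary line and anything unrecognized
        parts = line[3:].split()
        if not parts:
            continue  # marker with no check name
        target = passed if line.startswith("[+]") else failed
        target.append(parts[0])
    return passed, failed


# Hypothetical sample, shaped like the verbose readiness output of an API server.
SAMPLE = """\
[+]ping ok
[+]log ok
[-]etcd failed: reason withheld
[+]poststarthook/start-apiextensions-controllers ok
readyz check failed
"""

if __name__ == "__main__":
    ok, bad = parse_readyz(SAMPLE)
    print("passing:", ok)
    print("failing:", bad)
```

On a live cluster, the input text could be obtained with `kubectl get --raw='/readyz?verbose'`. A non-empty list of failing checks would be a cue to look more closely at the corresponding control plane component.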
Several layers of this model are covered in other documents. The Operating System and TCP layers are discussed in detail in the Monitoring Unix and Windows Servers document. The tests mapped to the Network layer are discussed in the Monitoring Cisco Router document, and the tests mapped to the JVM layer in the Monitoring Java Applications document. The tests mapped to the Pods, Containers, Node, Kubelet, and Kube Proxy layers are discussed in the Monitoring Kubernetes Worker document. The sections that follow therefore discuss only the remaining layers in detail.