What is Kubernetes?

Kubernetes is an open-source system for managing - i.e., running and coordinating - containerized applications across a cluster of machines. It allows users to define how their applications should run and how they should interact with other applications or the outside world. Using Kubernetes, users can ensure high availability of their containerized applications, scale their services up or down, perform graceful rolling updates, and switch traffic between different versions of applications to test features or roll back problematic deployments.

Kubernetes works continuously to keep the actual state of the cluster in line with the desired state that users declare. If the kubelet on a worker node fails to create a desired object - say, a Pod - then the desired state of the cluster will not be restored. Likewise, if a Pod running a critical application/service suddenly goes down, and the kubelet fails to restart that Pod or create another one in its stead, then again the actual state will not be in sync with the desired state. Under such circumstances, containerized applications and services may become unavailable to end-users. Since Kubernetes is widely used in mission-critical environments - e.g., microservices, DevOps, serverless computing, and multi-cloud environments - for processing business-critical workloads, the non-availability of applications can adversely impact productivity and business continuity. To avoid this, administrators must closely monitor the status of the objects managed and the operations performed by Kubernetes, proactively capture abnormalities, and resolve them well before end-users notice. This is where eG Enterprise helps!

eG Enterprise provides a dedicated monitoring model for those Kubernetes clusters that manage Docker hosts and containers.


eG Enterprise supports monitoring of Kubernetes on Linux platforms only, and not on Windows.

This model continuously monitors the status of the cluster nodes, the Kubernetes control plane services running on the master node, and the workloads and application services on the worker nodes. In the process, eG promptly detects and alerts administrators to real/potential operational failures that may cause a mismatch between the actual state of objects and the desired cluster state. Rapid problem detection enables swift problem resolution, which in turn ensures the high availability of business-critical applications/services running within the containers in the cluster.
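
The mismatch between desired and actual state described above can be illustrated with a minimal Python sketch. It is not how eG Enterprise is implemented, and it does not call the Kubernetes API. The `WorkloadStatus` type, the `find_drift` function, and the sample workload names are all hypothetical, and the replica counts are hard-coded purely for illustration. The sketch assumes that a workload counts as "out of sync" when fewer replicas are ready than were requested.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkloadStatus:
    """Hypothetical snapshot of one workload (illustrative only)."""
    name: str
    desired_replicas: int  # what the user declared
    ready_replicas: int    # what is actually running and ready


def find_drift(workloads: List[WorkloadStatus]) -> List[WorkloadStatus]:
    """Return the workloads whose ready count is below the desired count."""
    return [w for w in workloads if w.ready_replicas < w.desired_replicas]


# Hard-coded sample data; a real monitor would collect this from the cluster.
sample = [
    WorkloadStatus("web", desired_replicas=3, ready_replicas=3),
    WorkloadStatus("checkout", desired_replicas=2, ready_replicas=1),
    WorkloadStatus("batch", desired_replicas=1, ready_replicas=0),
]

for w in find_drift(sample):
    print(f"{w.name}: {w.ready_replicas}/{w.desired_replicas} replicas ready")
```

In this sketch, only the workloads with fewer ready replicas than desired are reported. These are the cases an administrator would want to be alerted to before end-users notice.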