K8s Services By Namespace Test

In Kubernetes, a Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service). Services enable a loose coupling between dependent Pods.

A Service is required because Pods are mortal: they are born, and they die. In a deployment, therefore, the set of Pods running at one moment in time could be different from the set of Pods running that application a moment later. This leads to a problem: if some set of Pods (call them “backends”) provides functionality to other Pods (call them “frontends”) inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that they can use the backend part of the workload? This is where Services help! By associating a Service with a set of dependent Pods, you ensure that Kubernetes automatically reconciles changes among those Pods, so that your applications continue to function.

A Service is defined using YAML (preferred) or JSON, like all Kubernetes objects. The set of Pods targeted by a Service is usually determined by a LabelSelector.
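
For illustration, here is a minimal sketch of such a definition, expressed as a Python dictionary and printed as JSON (the equivalent structure would normally be written as a YAML manifest). The Service name, the app label, and the port numbers below are hypothetical; the selector block is the LabelSelector referred to above.

    import json

    # A hypothetical Service definition expressed as a Python dict; in practice the
    # same structure is usually written as a YAML manifest.
    service_manifest = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "my-backend", "namespace": "default"},
        "spec": {
            "type": "ClusterIP",                # the default ServiceType
            "selector": {"app": "my-backend"},  # LabelSelector: targets Pods labelled app=my-backend
            "ports": [{"port": 80, "targetPort": 8080, "protocol": "TCP"}],
        },
    }

    print(json.dumps(service_manifest, indent=2))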

Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. In fact, Services allow your applications to receive traffic from outside the cluster. By default, however, a Service is accessible only from within the cluster. You can override this default using the ServiceType specification in the Service definition, which indicates where the Service should be exposed and what type of traffic (internal or external) it can receive.

This means that if a Service is not up and running, then, depending upon the ServiceType, its unavailability can deny external users access to the application and can even hamper internal application operations. To assure users of continued access to their applications running in the Kubernetes cluster, and to ensure peak application performance at all times, administrators should not only be able to promptly detect the non-availability of a Service, but should also be able to rapidly tell what type of Service it is and why it is not up. This is where the Services by Namespace test helps!

This test auto-discovers the Services defined within each namespace, and reports the current state, type, and age of each Service. This way, the test promptly alerts administrators if any Service is not up and running. The detailed diagnostics of the test also reveal why the Service is in that state. Additionally, the test reports the number and names of the Pods that each Service targets and the LabelSelector that each Service uses to identify those Pods. These details help in troubleshooting the abnormal state of a Service.

Target of the test : A Kubernetes/OpenShift Cluster

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each Service in every namespace configured in the Kubernetes/OpenShift cluster being monitored

First-level Descriptor: Namespace

Second-level Descriptor: Service

Configurable parameters for the test
Parameter Description

Test Period

How often should the test be executed.

Host

The IP address of the host for which this test is to be configured.

Port

Specify the port at which the specified Host listens. By default, this is 6443.

Load Balancer / Master Node IP

To run this test and report metrics, the eG agent needs to connect to the Kubernetes API on the master node and run API commands. To enable this connection, the eG agent has to be configured with either of the following:

  • If only a single master node exists in the cluster, then configure the eG agent with the IP address of the master node.
  • If the target cluster consists of more than one master node, then you need to configure the eG agent with the IP address of the load balancer that is managing the cluster. In this case, the load balancer will route the eG agent's connection request to any available master node in the cluster, thus enabling the agent to connect with the API server on that node, run API commands on it, and pull metrics.
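
To picture what this connection involves, the sketch below reaches the Kubernetes API through the configured IP address and the default port 6443; it is only a minimal illustration, not the eG agent's own implementation. The IP address is a placeholder, and certificate verification is disabled purely to keep the example short.

    import requests

    # Placeholder values: the master node IP for a single-master cluster, or the
    # load balancer IP for a multi-master cluster.
    API_HOST = "10.0.0.10"      # Load Balancer / Master Node IP
    API_PORT = 6443             # default Kubernetes API port

    # The /version endpoint is typically readable without credentials and is a quick
    # way to confirm that the API server (or the load balancer fronting it) is reachable.
    resp = requests.get(f"https://{API_HOST}:{API_PORT}/version", verify=False, timeout=10)
    print(resp.status_code, resp.json().get("gitVersion"))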

By default, this parameter will display the Load Balancer / Master Node IP that you configured when manually adding the Kubernetes/OpenShift cluster for monitoring, using the Kubernetes Cluster Preferences page in the eG admin interface (see Figure 3). The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise?

Whenever the eG agent runs this test, it uses the IP address that is displayed (by default) against this parameter to connect to the Kubernetes API. If this IP address changes at a later point in time, then make sure that you update this parameter with the new IP address by overriding its default setting.

K8s Cluster API Prefix

By default, this parameter is set to none. Do not disturb this setting if you are monitoring a Kubernetes/OpenShift Cluster.

To run this test and report metrics for Rancher clusters, the eG agent needs to connect to the Kubernetes API on the master node of the Rancher cluster and run API commands. The Kubernetes API of a Rancher cluster is typically of the format: http(s)://{IP Address of kubernetes}/{api endpoints}. The Server section of the kubeconfig.yaml file downloaded from the Rancher console helps in identifying the Kubernetes API of the cluster. For example, https://{IP address of Kubernetes}/k8s/clusters/c-m-bznxvg4w/ is usually the URL of the Kubernetes API of a Rancher cluster.

For the eG agent to connect to the master node of a Rancher cluster and pull out metrics, the eG agent should be made aware of the API endpoints in the Kubernetes API of the Rancher cluster. To aid this, you can specify the API endpoints available in the Kubernetes API of the Rancher cluster against this parameter. In our example, this parameter can be specified as: /k8s/clusters/c-m-bznxvg4w/.
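
The sketch below shows how such a prefix slots into the request URL; the IP address and port are placeholders, and the cluster ID is taken from the example above.

    # Hypothetical sketch of building an API URL for a Rancher cluster.
    API_HOST = "10.0.0.10"
    API_PORT = 443
    API_PREFIX = "/k8s/clusters/c-m-bznxvg4w"   # value configured against K8s Cluster API Prefix

    # For a plain Kubernetes/OpenShift cluster the prefix is left at none, so the
    # endpoint path follows the host and port directly.
    base_url = f"https://{API_HOST}:{API_PORT}{API_PREFIX}"
    services_url = f"{base_url}/api/v1/services"
    print(services_url)   # https://10.0.0.10:443/k8s/clusters/c-m-bznxvg4w/api/v1/services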

SSL

By default, the Kubernetes/OpenShift cluster is SSL-enabled. This is why the eG agent, by default, connects to the Kubernetes API via an HTTPS connection. Accordingly, this flag is set to Yes by default.

If the cluster is not SSL-enabled in your environment, then set this flag to No.
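
In effect, this flag decides the scheme over which the API is reached, as in the small sketch below (the IP address and port are placeholders).

    # Hypothetical sketch: SSL = Yes means HTTPS, SSL = No means plain HTTP.
    ssl_enabled = True                              # SSL flag set to Yes
    scheme = "https" if ssl_enabled else "http"
    api_url = f"{scheme}://10.0.0.10:6443/api"
    print(api_url)                                  # https://10.0.0.10:6443/api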

Authentication Token

The eG agent requires an authentication bearer token to access the Kubernetes API, run API commands on the cluster, and pull metrics of interest. The steps for generating this token have been detailed in How Does eG Enterprise Monitor a Kubernetes/OpenShift Cluster?

The steps for generating this token for a Rancher cluster have been detailed in How Does eG Enterprise Monitor a Rancher Cluster?

Typically, once you generate the token, you can associate that token with the target Kubernetes/OpenShift cluster, when manually adding that cluster for monitoring using the eG admin interface. The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise?

By default, this parameter will display the Authentication Token that you provided in the Kubernetes Cluster Preferences page of the eG admin interface, when manually adding the cluster for monitoring (see Figure 3).

Whenever the eG agent runs this test, it uses the token that is displayed (by default) against this parameter for accessing the API and pulling metrics. If, for any reason, you generate a new authentication token for the target cluster at a later point in time, then make sure you update this parameter with the change. For that, copy the new token and paste it against this parameter.
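
Conceptually, the token is presented to the API as a bearer token in the Authorization header, as in the hedged sketch below; the token string and IP address are placeholders, and this is not the eG agent's own code.

    import requests

    # Placeholders for illustration only.
    API_URL = "https://10.0.0.10:6443/api/v1/services"
    TOKEN = "<authentication-token>"

    headers = {"Authorization": f"Bearer {TOKEN}"}
    resp = requests.get(API_URL, headers=headers, verify=False, timeout=10)

    # 200 indicates the token is accepted; 401 usually means the token has expired
    # or been regenerated, in which case this parameter needs to be updated.
    print(resp.status_code)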

Report System Namespace

The kube-system namespace consists of all objects created by the Kubernetes system. Monitoring such a namespace may not only increase the eG agent's processing overheads, but may also clutter the eG database. Therefore, to optimize agent performance and to conserve database space, this test, by default, excludes the kube-system namespace from monitoring. Accordingly, this flag is set to No by default.

If required, you can set this flag to Yes to enable monitoring of the kube-system namespace.

Proxy Host

If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the IP address of the proxy server here. If no proxy is used, then the default setting of this parameter - none - need not be changed.

Proxy Port

If the eG agent connects to the Kubernetes API on the master node via a proxy server, then specify here the port number at which that proxy server listens. If no proxy is used, then the default setting of this parameter - none - need not be changed.

Proxy Username, Proxy Password, Confirm Password

These parameters are applicable only if the eG agent uses a proxy server to connect to the Kubernetes/OpenShift cluster, and that proxy server requires authentication. In this case, provide a valid user name and password against the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box.

If no proxy server is used, or if the proxy server used does not require authentication, then the default setting of these parameters - none - need not be changed.
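
For context, requests routed through an authenticating proxy typically look like the sketch below; the proxy host, port, and credentials are placeholders, and this is only an illustration of the mechanism.

    import requests

    # Placeholder proxy details.
    PROXY_HOST, PROXY_PORT = "proxy.example.com", 3128
    PROXY_USER, PROXY_PASS = "proxyuser", "proxypassword"

    # Credentials are embedded in the proxy URL; they are presented to the proxy,
    # not to the Kubernetes API server.
    proxies = {"https": f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"}

    resp = requests.get(
        "https://10.0.0.10:6443/version",
        proxies=proxies,
        verify=False,
        timeout=10,
    )
    print(resp.status_code)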

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnosis capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Service Type

Indicates the type of this Service.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value Numeric Value
ClusterIP 1
NodePort 2
ExternalName 3
LoadBalancer 4

Each of these types is briefly described below (an illustrative sketch of the corresponding spec fields follows the list):

  • ClusterIP: Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
  • NodePort: Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using <NodeIP>:<NodePort>. Superset of ClusterIP.
  • ExternalName: Exposes the Service using an arbitrary name (specified by externalName in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of kube-dns.
  • LoadBalancer: Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
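
The sketch below pairs each type with the spec fields that typically distinguish it; the port numbers and the external hostname are purely illustrative.

    # Illustrative spec fragments only; ports and hostnames are hypothetical.
    service_type_examples = {
        "ClusterIP":    {"type": "ClusterIP", "ports": [{"port": 80, "targetPort": 8080}]},
        "NodePort":     {"type": "NodePort", "ports": [{"port": 80, "targetPort": 8080, "nodePort": 30080}]},
        "ExternalName": {"type": "ExternalName", "externalName": "db.example.com"},
        "LoadBalancer": {"type": "LoadBalancer", "ports": [{"port": 443, "targetPort": 8443}]},
    }

    for service_type, spec in service_type_examples.items():
        print(service_type, "->", spec)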

Note:

By default, this test reports the Measure Values listed in the table above to indicate the Service type. In the graph of this measure, however, the type is indicated using the numeric equivalents only.

Time since service creation

Indicates how old this Service is.

 

The value of this measure is expressed in number of days, hours, and minutes.

You can use the detailed diagnosis of this measure to know the Cluster IP on which the Service has been exposed, the LabelSelector using which the Service identifies the Pods, and the internal and external endpoints associated with the Service.

Total pods in service

Indicates the number of pods that this Service targets.

Number

Use the detailed diagnosis of this measure to know which Pods are targeted by the Service and which Node each Pod is running on.
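
For context, one way to enumerate the Pods behind a Service is to query the Pods endpoint with the Service's label selector, as in the hedged sketch below; the IP address, token, namespace, and label are placeholders, and this is not the test's own logic.

    import requests

    # Placeholder connection details.
    API = "https://10.0.0.10:6443"
    TOKEN = "<authentication-token>"
    NAMESPACE = "default"
    LABEL_SELECTOR = "app=my-backend"   # the Service's LabelSelector

    resp = requests.get(
        f"{API}/api/v1/namespaces/{NAMESPACE}/pods",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"labelSelector": LABEL_SELECTOR},
        verify=False,
        timeout=10,
    )

    # spec.nodeName shows which Node each targeted Pod is running on.
    for pod in resp.json().get("items", []):
        print(pod["metadata"]["name"], "->", pod["spec"].get("nodeName"))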

Status

Indicates the current status of this Service.

 

The values that this measure reports and their corresponding numeric values are detailed in the table below:

Measure Value Numeric Value
Running 1
Pending 0

If the value of this measure is Pending, then you can use the detailed diagnosis of this measure to understand why the Service is in a Pending state.

Note:

By default, this test reports the Measure Values listed in the table above to indicate the Service status. In the graph of this measure, however, the status is indicated using the numeric equivalents only.

The detailed diagnosis of the Age measure reports the service type, the cluster IP address on which the service is exposed, the internal and external endpoints of the service, and the label selector.

Figure 1 : The detailed diagnosis of the Age measure of the Services by Namespace test