K8s Persistent Volume Claims By Namespace Test

A Persistent Volume Claim (PVC) is a request for storage resources in Kubernetes, allowing pods to access persistent storage independent of the pod lifecycle. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).

A user requests storage by creating a PVC that specifies a size, an access mode, and a storage class. The master node then searches for a Persistent Volume (PV) that satisfies the PVC's requirements. If no existing PV matches and the storage class supports dynamic provisioning, a new PV is provisioned dynamically. Once a suitable PV is available, the PVC is bound to it, and the pod can mount the volume and perform read/write operations according to the specified access modes. When a user is done with the volume, they can delete the PVC object from the API, which allows the resource to be reclaimed. The PV's Reclaim Policy then determines whether the volume is retained, deleted, or recycled.
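As a rough illustration of such a request, the following Python sketch assembles a PVC manifest as a dictionary, showing where the size, access mode, and storage class go. The claim name data-claim, the namespace eshop, the storage class standard, and the 10Gi size are hypothetical values chosen only for this example.

```python
import json


def build_pvc_manifest(name, namespace, size, access_mode, storage_class):
    """Return a PVC manifest (as a dict) carrying the three things a claim specifies."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": [access_mode],           # how the volume may be mounted
            "storageClassName": storage_class,      # which class of storage to use
            "resources": {"requests": {"storage": size}},  # requested size
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    manifest = build_pvc_manifest("data-claim", "eshop", "10Gi", "ReadWriteOnce", "standard")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be submitted to the cluster with any Kubernetes client; how it is submitted is outside the scope of this sketch.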

If there are many unfulfilled PVCs, an administrator may want to quickly check the status of each PVC to determine why it could not be bound to a Persistent Volume (PV). Is it because the PVC’s requested storage class or size does not match any available PVs? Or are there access mode mismatches preventing binding? Is it because the PVC is stuck in a Pending state due to lack of resources or misconfiguration? Or have PVCs remained unbound because the corresponding PVs were deleted or failed to provision? The Kube Persistent Volume Claims test provides insights into these questions!

This test monitors the health and utilization of Persistent Volume Claims (PVCs) across namespaces in a Kubernetes cluster. It checks whether each PVC is in the Bound state, ensuring that requested storage is successfully provisioned. It validates the access modes to confirm that the volume permissions meet the pod’s needs. The test verifies that the storage requested and storage assigned are consistent and sufficient for workloads. It tracks used space, free space, and percent usage to detect potential risks of storage exhaustion. It also considers the time since volume claim creation to track down aged claims that might require review. This way, the test helps you to ensure optimal storage usage and avoid disruptions caused by insufficient persistent storage.

Target of the test : A Kubernetes Namespace

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each persistent volume claim in every namespace being monitored.

Configurable parameters for the test
Parameter	Description
Test Period	How often should the test be executed.
Host	The IP address of the host for which this test is to be configured.
Port	Specify the port at which the specified Host listens. By default, this is 6443.
Load Balancer / Master Node IP	To run this test and report metrics, the eG agent needs to connect to the Kubernetes API on the master node and run API commands. To enable this connection, the eG agent has to be configured with either of the following: If only a single master node exists in the cluster, then configure the eG agent with the IP address of the master node. If the target cluster consists of more than one master node, then you need to configure the eG agent with the IP address of the load balancer that is managing the cluster. In this case, the load balancer will route the eG agent's connection request to any available master node in the cluster, thus enabling the agent to connect with the API server on that node, run API commands on it, and pull metrics. By default, this parameter will display the Load Balancer / Master Node IP that you configured when manually adding the Kubernetes/OpenShift cluster for monitoring, using the Kubernetes Cluster Preferences page in the eG admin interface (see Figure 3). The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise? Whenever the eG agent runs this test, it uses the IP address that is displayed (by default) against this parameter to connect to the Kubernetes API. If there is any change in this IP address at a later point in time, then make sure that you update this parameter with it, by overriding its default setting.
SSL	By default, the Kubernetes/OpenShift cluster is SSL-enabled. This is why the eG agent, by default, connects to the Kubernetes API via an HTTPS connection. Accordingly, this flag is set to Yes by default. If the cluster is not SSL-enabled in your environment, then set this flag to No.
K8s Cluster API Prefix	By default, this parameter is set to none. Do not change this setting if you are monitoring a Kubernetes/OpenShift Cluster. To run this test and report metrics for Rancher clusters, the eG agent needs to connect to the Kubernetes API on the master node of the Rancher cluster and run API commands. The Kubernetes API of Rancher clusters is of the default format: http(s)://{IP Address of kubernetes}/{api endpoints}. The Server section of the kubeconfig.yaml file downloaded from the Rancher console helps in identifying the Kubernetes API of the cluster. For example, https://{IP address of Kubernetes}/k8s/clusters/c-m-bznxvg4w/ is usually the URL of the Kubernetes API of a Rancher cluster. For the eG agent to connect to the master node of a Rancher cluster and pull metrics, the eG agent should be made aware of the API endpoints in the Kubernetes API of the Rancher cluster. To aid this, you can specify the API endpoints available in the Kubernetes API of the Rancher cluster against this parameter. In our example, this parameter can be specified as: /k8s/clusters/c-m-bznxvg4w/.
Authentication Token	The eG agent requires an authentication bearer token to access the Kubernetes API, run API commands on the cluster, and pull metrics of interest. The steps for generating this token have been detailed in How Does eG Enterprise Monitor a Kubernetes/OpenShift Cluster? Typically, once you generate the token, you can associate that token with the target Kubernetes/OpenShift cluster, when manually adding that cluster for monitoring using the eG admin interface. The steps for managing the cluster using the eG admin interface are discussed elaborately in How to Monitor the Kubernetes/OpenShift Cluster Using eG Enterprise? By default, this parameter will display the Authentication Token that you provided in the Kubernetes Cluster Preferences page of the eG admin interface, when manually adding the cluster for monitoring (see Figure 3). Whenever the eG agent runs this test, it uses the token that is displayed (by default) against this parameter for accessing the API and pulling metrics. If for any reason, you generate a new authentication token for the target cluster at a later point in time, then make sure you update this parameter with the change. For that, copy the new token and paste it against this parameter.
Namespace to Monitor	To enable the eG agent to monitor a specific Namespace on a Kubernetes/OpenShift cluster, specify the name of that Namespace against this parameter. For instance, eshop. Doing so will enable the eG agent to monitor and report metrics specific to this Namespace.
Proxy Host	If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the IP address of the proxy server here. If no proxy is used, then the default setting - none - of this parameter need not be changed.
Proxy Port	If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the port number at which that proxy server listens here. If no proxy is used, then the default setting - none - of this parameter need not be changed.
Proxy Username, Proxy Password, Confirm Password	These parameters are applicable only if the eG agent uses a proxy server to connect to the Kubernetes/OpenShift cluster, and that proxy server requires authentication. In this case, provide a valid user name and password against the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box. If no proxy server is used, or if the proxy server used does not require authentication, then the default setting - none - of these parameters, need not be changed.
Kubernetes version	The Version text box indicates the version of the Kubernetes/OpenShift cluster to be managed. The default value is none. If the value of this parameter is not "none", the test uses the value provided (e.g., 28.1) as the Kubernetes version.
Timeout	Specify the duration (in seconds) for which this test should wait for a response from the Kubernetes/OpenShift cluster. If there is no response from the cluster beyond the configured duration, the test will time out. By default, this is set to 5 seconds.
DD Frequency	Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 3:1. This indicates that, by default, detailed measures will be generated every third time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: (1) the eG manager license should allow the detailed diagnosis capability; (2) both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
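
To illustrate how the Host, Port, SSL, K8s Cluster API Prefix, Namespace to Monitor, and Authentication Token parameters could fit together, here is a minimal Python sketch that assembles a request URL and headers for listing the PVCs in a namespace. The path /api/v1/namespaces/{namespace}/persistentvolumeclaims is the standard Kubernetes core API path. The function names and the sample IP address are hypothetical, and this sketch does not describe how the eG agent itself is implemented.

```python
def build_pvc_list_url(host, port=6443, ssl=True, api_prefix="none", namespace="default"):
    """Assemble the URL for listing PVCs in one namespace.

    api_prefix mirrors the 'K8s Cluster API Prefix' parameter: 'none' means
    no prefix; a Rancher-style value such as '/k8s/clusters/<id>/' is
    inserted before the standard API path.
    """
    scheme = "https" if ssl else "http"
    authority = host if port is None else f"{host}:{port}"
    if api_prefix is None or api_prefix.strip().lower() in ("", "none"):
        prefix = ""
    else:
        # Normalize so exactly one slash separates each URL segment.
        prefix = "/" + api_prefix.strip().strip("/")
    return f"{scheme}://{authority}{prefix}/api/v1/namespaces/{namespace}/persistentvolumeclaims"


def build_headers(token):
    """Bearer-token headers for the Kubernetes API."""
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


if __name__ == "__main__":
    # 192.168.1.10 is a made-up address used only for illustration.
    print(build_pvc_list_url("192.168.1.10", namespace="eshop"))
    print(build_pvc_list_url("192.168.1.10", port=None,
                             api_prefix="/k8s/clusters/c-m-bznxvg4w/", namespace="eshop"))
```

An HTTP client would send a GET request to the resulting URL with these headers, optionally through the configured proxy and with the configured timeout.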

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Status

Indicates the current status of this Persistent Volume Claim.

This measure can report any of the following values:

Available: A free resource that is not yet bound to a claim
Bound: The volume is bound to a claim
Released: The claim has been deleted, but the resource is not reclaimed by the cluster. This depends upon the reclaim policy of the PV. For instance, if the reclaim policy is Retain, then the cluster will not automatically reclaim the resource once it is released; it can only be manually reclaimed.
Failed: The volume has failed its automatic reclamation.
Pending: The PVC has been created but is still waiting to be matched to a suitable Persistent Volume (PV).

The numeric values that correspond to these measure values are as follows:

Measure Value	Numeric Value
Available	1
Bound	2
Released	3
Failed	4
Pending	5

Note:

By default, this test reports the Measure Values listed in the table above to indicate the state of the PVC. In the graph of this measure, however, the state is indicated using the numeric equivalents only.
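
The mapping in the table above can be expressed as a small lookup. The following Python sketch is illustrative only: the function names and the sample claim names are hypothetical, and the numeric values are copied from the table.

```python
# Numeric equivalents of the Status measure, as listed in the table above.
STATUS_TO_NUMERIC = {
    "Available": 1,
    "Bound": 2,
    "Released": 3,
    "Failed": 4,
    "Pending": 5,
}


def status_to_numeric(status):
    """Translate a reported status into the value plotted in the graph."""
    return STATUS_TO_NUMERIC[status]


def claims_not_bound(claim_status):
    """Return the names of claims whose status is anything other than Bound."""
    return sorted(name for name, status in claim_status.items() if status != "Bound")


if __name__ == "__main__":
    # Hypothetical claims, for illustration only.
    sample = {"db-data": "Bound", "cache": "Pending", "logs": "Bound"}
    print(claims_not_bound(sample))
```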

Using the detailed diagnosis of this measure, you can find out the Volume Name, Reclaim Policy, Storage Class, Volume Mode, Pod Name, and Node Name.

Time since volume claim created

Indicates how old this PVC is.

The value of this measure is expressed in number of days, hours, and minutes.
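
As a sketch of how such an age could be derived, the Python snippet below assumes the PVC's metadata.creationTimestamp field (an RFC 3339 UTC timestamp such as 2024-01-01T00:00:00Z) and breaks the elapsed time into days, hours, and minutes. The function name claim_age is hypothetical, and the test's actual computation may differ.

```python
from datetime import datetime, timezone


def claim_age(creation_timestamp, now=None):
    """Return (days, hours, minutes) elapsed since the claim was created.

    creation_timestamp is expected in the form 'YYYY-MM-DDTHH:MM:SSZ' (UTC).
    """
    created = datetime.strptime(creation_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    created = created.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    total_minutes = int((now - created).total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return days, hours, minutes


if __name__ == "__main__":
    reference = datetime(2024, 1, 3, 5, 7, tzinfo=timezone.utc)
    days, hours, minutes = claim_age("2024-01-01T00:00:00Z", now=reference)
    print(f"{days} days, {hours} hours, {minutes} minutes")
```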

Access modes

Indicates the access modes configured for this PVC.

When you create a Persistent Volume Claim (PVC), you can request specific access modes - which define how the volume can be mounted and accessed by the pods. As shown in the table below, providers will have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.

The access modes are:

ReadWriteOnce – the volume can be mounted as read-write by a single node
ReadOnlyMany – the volume can be mounted read-only by many nodes
ReadWriteMany – the volume can be mounted as read-write by many nodes

The aforesaid access modes also represent the values that this measure can report. The numeric values that correspond to these measure values are as follows:

Measure Value	Numeric Value
ReadOnlyMany	1
ReadWriteMany	2
ReadWriteOnce	3

Note:

By default, this test reports the Measure Values listed in the table above to indicate the access mode of a PVC. In the graph of this measure, however, the access mode is indicated using the numeric equivalents only.

Storage requested

Indicates the amount of storage space that this PVC requested when it was created.

This measure shows the size of storage that a PVC demands from Kubernetes to provision or bind for use by a pod.

Storage assigned

Indicates the amount of storage that was allocated to this PVC after it was successfully bound to a Persistent Volume (PV).
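
To compare the requested and assigned storage, the quantity strings used by Kubernetes (for example 10Gi or 5G) first need to be converted to a common unit. The Python sketch below is a simplified converter that handles the binary (Ki, Mi, Gi, ...) and decimal (k, M, G, ...) suffixes only. The function names are hypothetical. The sketch assumes the requested size comes from spec.resources.requests.storage and the assigned size from status.capacity.storage of the PVC object.

```python
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}


def quantity_to_bytes(quantity):
    """Convert a Kubernetes quantity string such as '10Gi' to bytes.

    Simplified: other quantity forms (for example the milli suffix 'm')
    are not handled here.
    """
    quantity = quantity.strip()
    # Check two-letter binary suffixes first so 'Gi' is not mistaken for 'G'.
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-2]) * factor)
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-1]) * factor)
    return int(float(quantity))  # plain number of bytes


def assigned_covers_request(requested, assigned):
    """True if the assigned storage is at least as large as the request."""
    return quantity_to_bytes(assigned) >= quantity_to_bytes(requested)


if __name__ == "__main__":
    print(quantity_to_bytes("10Gi"))
    print(assigned_covers_request("8Gi", "10Gi"))
```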

Total capacity

Indicates the full storage size available on the Persistent Volume for binding to this PVC.

Used space

Indicates the amount of storage that has been consumed by the pods using this PVC.

A value close to the Total Capacity measure indicates that the PV is currently running out of space.

Free space

Indicates the amount of storage still available for use on the volume bound to this PVC.

A high value is desired for this measure.

Percent usage

Indicates the percentage of the total storage capacity of this PVC that has been consumed.

Percent

A value close to 100 percent indicates that the PVC is running out of space.

Compare the value across Persistent Volume Claims to identify the Persistent Volume Claim that is frequently running out of space.
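
The relationship between the capacity and usage measures can be sketched as follows, assuming that percent usage is used space divided by total capacity, and that free space is the difference between the two. The function names, claim names, and figures are hypothetical and serve only to show the arithmetic and the cross-claim comparison.

```python
def free_space(used, capacity):
    """Space still available on the volume, in the same unit as the inputs."""
    return capacity - used


def percent_usage(used, capacity):
    """Used space as a percentage of total capacity, rounded to two decimals."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return round(100.0 * used / capacity, 2)


def most_used_claim(usage_by_claim):
    """Return the name of the claim with the highest percent usage.

    usage_by_claim maps a claim name to a (used, capacity) tuple.
    """
    return max(usage_by_claim, key=lambda name: percent_usage(*usage_by_claim[name]))


if __name__ == "__main__":
    # Made-up figures in GB, for illustration only.
    claims = {"db-data": (7.5, 10.0), "cache": (1.0, 5.0), "logs": (19.0, 20.0)}
    for name, (used, capacity) in claims.items():
        print(name, percent_usage(used, capacity), free_space(used, capacity))
    print("Highest usage:", most_used_claim(claims))
```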