API Server Status Test

The API Server status indicates the health and operational state of the Kubernetes API Server, the central component of the control plane. The API Server manages cluster requests, interacts with etcd, and ensures resource synchronization. Monitoring its status helps detect failures and performance issues, and ensures the cluster's seamless operation.

Monitoring the API Server status is crucial to ensure the health of the Kubernetes control plane. It helps detect failures, performance degradation, or resource constraints, preventing disruptions in cluster operations. Timely monitoring enables proactive issue resolution, ensuring seamless communication between components and maintaining cluster availability and performance.

The API Server Status Test continuously monitors the API Server on the target node and reports key metrics detailing events, requests, queues, and authentication. These metrics help administrators confirm that the service is up and address problems before they disrupt the cluster.

Target of the test : A Kubernetes Master Node

Agent deploying the test : An internal agent

Outputs of the test : One set of results for the target Kubernetes Master node being monitored

Configurable parameters for the test
Parameter	Description
Test Period	How often the test should be executed.
Host	The IP address of the host for which this test is to be configured.
Port	Specify the port at which the specified Host listens. By default, this is 6443.
Timeout	In the Timeout text box, specify the duration (in seconds) beyond which the test will time out. The default value is 10 seconds.
Metric URL	Each Kubernetes system component exposes monitoring metrics through the /metrics endpoint of its HTTP server. For components that do not expose this endpoint by default, refer to the official documentation of your Kubernetes distribution. Specify the metric URL in the Metric URL text box.
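
As a rough illustration of how the Host, Port, Timeout, and Metric URL parameters fit together, the sketch below builds a metrics URL and fetches it with a timeout. The function names build_metrics_url and fetch_metrics, the HTTPS scheme, the bearer token, and the CA file are assumptions made for this example; they do not describe how the test itself is implemented.

```python
import ssl
import urllib.request


def build_metrics_url(host, port=6443, path="/metrics"):
    """Combine Host, Port and the metrics path into a URL (HTTPS is assumed)."""
    if not path.startswith("/"):
        path = "/" + path
    return f"https://{host}:{port}{path}"


def fetch_metrics(url, token=None, timeout=10, ca_file=None):
    """Return the raw metrics text; raises on HTTP errors or on timeout.

    token and ca_file are hypothetical inputs: many clusters require a
    bearer token and the cluster CA certificate to read /metrics.
    """
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    # With ca_file=None the system trust store is used.
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return resp.read().decode("utf-8")
```

For example, build_metrics_url("192.168.10.20") yields https://192.168.10.20:6443/metrics, which matches the default Port of 6443 described above.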

Measurements made by the test
Measurement	Description	Measurement Unit	Interpretation
Number of watch event size distribution	Indicates the number of distributions created for watch event size.	Number	Large watch events consume more network bandwidth and memory, potentially overloading the API Server and clients.
Total watch event size distribution	Indicates the total number of watch event size distributions.	Number
Average watch event size distribution	Indicates the average number of watch event size distribution.	Number
Max limit of used inflight request	Indicates the number of requests that the API Server can handle concurrently.	Number	This limit ensures that the API Server does not get overwhelmed by excessive requests, maintaining stable performance.
Long running apiserver requests	Indicates the number of requests that have been running for a long time.	Number	Prolonged request processing times can lead to delays in deployments, scaling, or monitoring, affecting cluster responsiveness.
Number of request latency	Indicates the number of times the time taken by the Kubernetes API Server to process API requests was measured.	Number	Monitoring this latency provides insights into the performance of the API Server and its ability to handle workloads efficiently.
Total request latency	Indicates the total time taken by the Kubernetes API Server to process all API requests.	Milliseconds	The API Server exposes metrics that measure latency for different types of requests.
Average request latency	Indicates the average time taken by the Kubernetes API Server to process API requests across all requests.	Milliseconds
Requests apiserver terminated in self-defense	Indicates the number of requests terminated by the API Server to protect its performance.	Number	If the number of concurrent requests surpasses the limits set by --max-requests-inflight or --max-mutating-requests-inflight, the API Server rejects additional requests with a 429 Too Many Requests error.
Request received by deprecated APIs	Indicates the number of API requests made to versions or endpoints that have been marked as deprecated.	Number
Authenticated requests	Indicates the number of authentication requests sent to the API Server by clients.	Number
Authenticated request attempts	Indicates the number of authentication request attempts made to the API Server by clients.	Number
Number of request	Indicates the total number of requests received by the API Server during the last measurement period.	Number	If the number of requests is very high, some requests may be dropped.
Total request duration	Indicates the cumulative time taken by the Kubernetes API Server to process all incoming API requests during the last measurement period.	Milliseconds	Helps assess whether the API Server is responding to requests within acceptable time limits and if performance tuning is needed.
Average request duration	Indicates the mean time taken by the Kubernetes API Server to process individual API requests.	Milliseconds	A high average request duration indicates potential bottlenecks in the API Server, etcd, or networking layers.
HTTP requests	Indicates the total number of HTTP requests received by the API Server.	Number
Depth of workqueue	Indicates the number of items currently present in the workqueue.	Number
Number of additions handled by work queue	Indicates the count of items (e.g., resource events) added to a controller’s workqueue for processing.	Number	High numbers of additions indicate frequent resource updates or events, which may increase controller load.
Number of authentication	Indicates the total number of authentications that occurred during the last measurement period.	Number
Total authentication duration	Indicates the total time taken by the API Server for authentication during the last measurement period.	Milliseconds
Average authentication duration	Indicates the average time taken by each authentication request.	Milliseconds
API Services which are marked as unavailable	Indicates the number of services marked as unavailable by the API Server after health checks.	Number	The API Server periodically checks the health of API services. If a service fails to respond or returns errors, it is marked as unavailable.
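
Several measurements above come in number / total / average groups, for example request latency, request duration, and authentication duration. Assuming these are derived from Prometheus-style histogram series, where a _count sample holds the number of observations and a _sum sample holds their total, the average is the total divided by the count. The sketch below works through that arithmetic on a small made-up sample. The sample values, the helper names sum_series and average_ms, and the seconds-to-milliseconds conversion are assumptions for illustration, not a description of how the test computes its results.

```python
def sum_series(metrics_text, series_name):
    """Add up every sample of one series across all of its label sets.

    Assumes each sample line ends with its value (no trailing timestamp).
    """
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name == series_name:
            total += float(value)
    return total


def average_ms(metrics_text, base_name):
    """Average observation in milliseconds, from <base>_sum and <base>_count."""
    count = sum_series(metrics_text, base_name + "_count")
    total_seconds = sum_series(metrics_text, base_name + "_sum")
    if count == 0:
        return 0.0  # nothing was measured in this sample
    return total_seconds * 1000.0 / count


# Illustrative sample only; real output contains many more series and labels.
SAMPLE = """\
# TYPE apiserver_request_duration_seconds histogram
apiserver_request_duration_seconds_sum{verb="GET",resource="pods"} 1.5
apiserver_request_duration_seconds_count{verb="GET",resource="pods"} 30
apiserver_request_duration_seconds_sum{verb="LIST",resource="nodes"} 0.5
apiserver_request_duration_seconds_count{verb="LIST",resource="nodes"} 10
"""

if __name__ == "__main__":
    base = "apiserver_request_duration_seconds"
    print("count:", sum_series(SAMPLE, base + "_count"))
    print("average (ms):", average_ms(SAMPLE, base))
```

In this sample, 40 requests took 2.0 seconds in total, so the average works out to 50 milliseconds per request. A rising average with a steady count points to slower processing rather than higher load.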